In this dissertation, reduced-order models for a large class of nonlinear dynamical
systems are derived in terms of Volterra series. In a Volterra series representation,
the output of a dynamical system is expressed in terms of an infinite series of integral
operators. Each Volterra operator has an associated kernel, and these kernels completely
characterize the system dynamics. In this study, wavelets, which have proven
effective in compressing certain types of integral operators, are used to obtain low-order
estimates of first and second-order Volterra kernels for time-invariant systems.
Two wavelet bases are constructed for the representation of Volterra kernels. First,
piecewise-polynomial multiwavelets are constructed from classical finite element basis
functions using the technique of intertwining. These multiwavelets are used to
approximate the first-order, linear kernel and the symmetric form of the second-order
kernel. The second wavelet basis consists of triangular wavelets that are constructed
over a triangular domain. These wavelets are suitable for the representation of the
triangular form of the second-order kernel. In this implementation, the Haar wavelet
is used to approximate the first-order kernel.

The wavelet-based kernel identification algorithms are validated for both a linear
oscillator and a quadratic nonlinear system. The dynamics of these two systems can
be completely described in terms of the first and second-order kernels, respectively.
For these simple examples, it is demonstrated that relatively accurate, low-order estimates
of the analytical kernels can be obtained. First and second-order kernels are
then identified for a simulated nonlinear oscillator with a polynomial nonlinearity.
It is demonstrated that the identified second-order kernels are capable of predicting
the nonlinear dynamics for a bounded range of input amplitudes. For inputs exceeding
this amplitude bound, higher-order kernels are needed to model the nonlinear
response. Finally, first-order kernels are extracted from flight flutter data from the
Aerostructures Test Wing. This example demonstrates the ability of the wavelet-based
algorithms to extract first-order kernels from noisy flight data, and a practical
application of these kernels to flutter analysis is described.


CHAPTER  1 
INTRODUCTION 


The development of models for nonlinear dynamical systems is of critical importance
in many engineering fields, such as structural dynamics, fluid mechanics,
aeroelasticity, biomedical engineering, signal processing, and control theory. Although
linear models are often obtained through the linearization of a nonlinear
system about some nominal condition, it is often necessary to generate models that
include nonlinear effects. This is the case with many aeroservoelastic systems, for
example, in which nonlinearities arising from the interaction of the structure, airflow,
and flight control system can have a significant effect. Sometimes, as is often the case
in biomedical applications, nonlinear models are desired in order to gain a better
physical understanding of the system. In other cases, such as in many electrical engineering
applications, the purpose of the models is to compensate for nonlinear effects
such as signal distortion. In many cases, nonlinear models are sought for the purpose
of control design. Due to the computational costs of developing and implementing
nonlinear models, it is desirable to obtain reduced-order models whenever possible.
This is especially true for the design of online control systems where low-order models
are essential. The main objective of this dissertation is to develop reduced-order
models that are suitable for the representation of a large class of nonlinear dynamical
systems.

In  this  dissertation,  nonlinear  models  for  dynamical  systems  are  generated  in 
the form of Volterra series representations. This approach utilizes the Volterra theory
of nonlinear systems, which states that the output of a nonlinear system can be
expressed in terms of an infinite sum of integral operators of increasing order. Each
of these operators has an associated kernel, and a given dynamical system is completely
characterized by these kernels. Therefore, the problem of system identification
becomes one of extracting these Volterra kernels from input/output data from the
system. In practice, this is a formidable task due to the large dimensions of the kernels
of second order and higher. In general, for an N-dimensional input/output data set,
the identification of the rth-order Volterra kernel requires solving for N^r coefficients.
The  approach  detailed  in  this  dissertation  uses  wavelets  and  multiresolution  analysis 
to  generate  reduced-order  representations  of  Volterra  kernels.  Wavelets  have  proven 
to  be  very  effective  in  compressing  certain  integral  operators  [1].  A wavelet  basis  is 
composed  of  the  scaled  translates  and  dilates  of  a single,  compactly-supported  wavelet 
function.  Therefore,  using  wavelets  and  multiresolution  analysis,  the  Volterra  kernels 
can  be  decomposed  in  terms  of  functions  that  are  localized  in  the  time  and  frequency 
domains.  Often,  many  of  the  coefficients  in  a wavelet  expansion  are  very  small  and 
can  be  neglected,  leading  to  a relatively  sparse  representation.  There  are  numerous 
wavelet  families  from  which  to  choose,  each  with  its  own  unique  set  of  properties. 
One  of  the  difficulties  associated  with  using  wavelets  to  represent  Volterra  kernels  is 
that  many  wavelets  are  not  easily  adapted  to  the  boundaries  over  which  the  kernels 
are  supported.  A large  portion  of  this  dissertation  is  dedicated  to  the  fundamentals  of 
wavelets and the design of wavelets that are suitable for kernel identification. In particular,
the derivation of piecewise-polynomial multiwavelets and the construction of
wavelets that are supported on a triangular domain are discussed in detail. The multiwavelets
are easily adapted to the square domain over which the symmetric form of
the second-order kernel is supported, while the triangular wavelets are well-adapted
to the region of support for the triangular form of the second-order kernel. This
dissertation focuses on the identification of first and second-order Volterra kernels
for time-invariant, single-input/single-output systems. In many cases, the first and
second-order kernels alone are sufficient to characterize the system, so the approach
taken here is applicable to a large class of dynamical systems. The wavelet-based
kernel identification algorithms are demonstrated on several dynamical systems for
which the kernels have a known, analytical form. In addition, they have been implemented
in the identification of Volterra kernels from flight flutter data taken from
the  Aerostructures  Test  Wing  (ATW)  at  NASA  Dryden. 

In  the  remainder  of  this  chapter,  a brief  introduction  and  review  of  literature  is 
given  for  Volterra  theory  and  wavelets.  In  Chapter  2,  the  fundamentals  of  Volterra 
series  representations  are  discussed  in  detail.  This  chapter  includes  a discussion  of  the 
general properties of Volterra series, the first-order (linear) and higher-order (nonlinear)
Volterra kernels, and the relationship between Volterra series and ordinary differential
equations (ODEs). Chapter 3 is a review of wavelets and multiresolution analysis.
An introduction to wavelets is given, and wavelets are compared to two of the
most  popular  tools  for  the  approximation  of  functions  in  engineering  practice,  Fourier 
analysis  and  the  finite  element  method  (FEM).  The  Haar  wavelet  is  discussed  in  detail 
as  a prototype  wavelet  for  which  many  wavelet  properties  are  easy  to  visualize.  The 
discussion  is  then  extended  to  orthonormal  wavelet  families  and,  even  more  generally, 
orthonormal multiwavelets. Finally, two-dimensional tensor product wavelets are introduced
as a means for approximating two-dimensional functions or signals. Chapter
4 is devoted to the construction of orthonormal, piecewise-polynomial multiwavelets.
The method of intertwining multiresolution analyses, developed by Donovan, Geronimo,
and Hardin [2, 3], is used to construct piecewise-quadratic and piecewise-cubic
multiwavelets. It is demonstrated that this technique can be used to generate multiwavelets
of arbitrary approximation order from the classical finite element basis
functions. In Chapter 5, the method of constructing wavelets over two-dimensional,
finite sets given by Micchelli and Xu [4, 5] is used to create wavelets that are supported
over a triangular domain. This domain corresponds to the region over which
the triangular form of the second-order Volterra kernel is supported.

The first five chapters constitute the background needed to apply wavelets to
the  identification  of  Volterra  kernels  for  dynamical  systems.  In  this  dissertation,  two 
different  wavelet  bases  are  used  for  approximating  first  and  second-order  Volterra 
kernels.  The  piecewise-polynomial  multiwavelets  derived  in  Chapter  4 are  used  to 
identify  the  first-order  kernel  and  the  symmetric  form  of  the  second-order  kernel, 
which  is  supported  on  a square  domain.  The  triangular  wavelets  constructed  in 
Chapter  5 are  used  to  identify  the  triangular  form  of  the  second-order  kernel.  The 
triangular  wavelets  are  piecewise-constant  in  form  and  are  a two-dimensional  analog 
to  the  Haar  wavelet.  Therefore,  the  Haar  wavelet  and  the  triangular  wavelet  basis 
are  used  together  to  identify  the  first  and  second-order  kernels,  respectively.  These 
wavelet-based  kernel  identification  algorithms  are  described  in  detail  in  Chapter  6.  A 
survey  of  different  approaches  to  kernel  identification  is  given  in  Section  6.1.  Then, 
the  two  wavelet  implementations  are  derived  in  Sections  6.2  and  6.3.  In  both  cases, 
a matrix  formulation  is  obtained  whereby  the  wavelet  coefficients  that  represent  the 
kernels must be solved for in a least-squares sense. This problem is frequently ill-posed,
however, since kernel identification constitutes an inverse problem in that the
system  is  identified  from  input/output  data.  Therefore,  regularization  techniques 
such  as  the  truncated  singular  value  decomposition  are  often  needed  to  obtain  stable 
kernel  estimates.  This  is  described  in  Section  6.4.  Section  6.5  then  elaborates  on  some 
of  the  specific  details  and  issues  involved  with  the  implementation  of  the  wavelet-based 
kernel  identification  algorithms.  Finally,  in  Chapter  7,  these  algorithms  are  used  to 
extract  Volterra  kernels  from  input/output  data  from  a number  of  dynamical  systems. 
First,  a linear  oscillator  is  considered  in  Section  7.1.  Then,  a nonlinear  system  that 
can  be  described  in  terms  of  the  second-order  kernel  alone  is  analyzed  in  order  to 
validate  the  second-order  part  of  the  kernel  identification  algorithms.  In  Section  7.3, 
first  and  second-order  kernels  are  identified  for  a nonlinear  oscillator.  The  ability  of 
the estimated kernels to predict the outputs corresponding to a number of different
inputs is then evaluated. Section 7.4 gives a practical application of the wavelet-based
implementations  as  Volterra  kernels  are  extracted  from  flight  flutter  data  taken  from 
the  Aerostructures  Test  Wing  (ATW).  The  ATW  was  a small-scale  wing  that  was 
developed  at  the  NASA  Dryden  Flight  Research  Center  to  study  flutter,  an  aeroelastic 
instability.  It  is  shown  that  the  first-order  kernel  identified  from  the  flight  data  can 
be  incorporated  into  the  flutterometer  [6],  a flutter  prediction  tool,  in  order  to  obtain 
more  accurate  flutter  estimates.  Finally,  conclusions  and  ideas  for  future  research  are 
discussed  in  Chapter  8. 
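
As a rough illustration of the regularization step mentioned above, the truncated singular value decomposition can be written in generic notation. This is a sketch, not the formulation of Section 6.4: the matrix A, the coefficient vector c, the output vector y, and the truncation index k are symbols assumed here for illustration, with the least-squares problem written as A c ≈ y and n denoting the rank of A.

```latex
A = \sum_{i=1}^{n} \sigma_i\, u_i v_i^{T}, \qquad
\hat{c}_k = \sum_{i=1}^{k} \frac{u_i^{T} y}{\sigma_i}\, v_i, \qquad k \le n .
```

Discarding the terms associated with the smallest singular values keeps noise in y from being amplified by the factors 1/σ_i, at the cost of some bias in the estimate.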


1.1  Volterra  Series  Representations 

The  Volterra  theory  of  nonlinear  systems  states  that  the  output  of  a nonlinear 
dynamical  system  can  be  represented  by  an  infinite  series  of  integral  operators  of 
increasing  order.  Each  operator  has  an  associated  kernel,  and  these  kernels  form  the 
model  for  a given  system.  The  first-order  kernel  represents  linear  dynamics  and,  for 
a linear  system,  is  equivalent  to  the  impulse  response  of  the  system.  The  first-order 
operator  is  simply  a convolution  of  the  system  input  and  the  first-order  kernel.  In 
formulating  his  theory,  Volterra  [7]  basically  extended  the  concept  of  convolution 
to  higher  dimensions.  The  resulting  operators  are  capable  of  describing  nonlinear 
dynamics  of  varying  order.  Just  as  the  first-order  kernel  can  be  interpreted  as  an 
impulse  response,  the  higher-order  kernels  can  be  related  to  the  response  of  the  system 
to  multiple  impulses  applied  at  different  times.  It  has  been  noted  by  many  researchers 
that  the  Volterra  series  can  be  viewed  as  a polynomial  representation  for  systems  with 
memory,  or  systems  for  which  the  output  is  influenced  by  past  inputs.  As  discussed 
in  detail  by  Boyd  et  al.  [8],  Volterra  series  representations  are  appropriate  for  systems 
with  fading  memory;  that  is,  systems  for  which  the  influence  of  an  input  applied  at  a 
given  time  fades  in  a finite  period  of  time.  Ku  and  Wolf  [9]  determined  many  of  the 
fundamental  convergence  properties  of  Volterra  series,  which  have  also  been  addressed 
by Boyd et al. [10]. Sandberg [11] added to these results in addressing the convergence
of the “doubly finite” Volterra series, which corresponds to the case where not only is the
series truncated but the kernels are also assumed to have truncated forms. Fliess
et  al.  [12]  also  provided  several  important  analytical  results,  such  as  the  derivation 
of  analytical  expressions  for  the  Volterra  kernels  that  appear  in  the  representation  of 
ordinary  differential  equations  (ODEs).  Wiener  [13]  developed  an  orthogonalization 
of  the  Volterra  series,  and  many  researchers  model  dynamical  systems  in  terms  of  the 
resulting  Wiener  kernels.  There  are  many  texts  available  that  discuss  the  Volterra 
theory  in  detail,  including  works  by  Boyd  [14],  Rugh  [15],  and  Schetzen  [16]. 
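
To make the idea of higher-order convolution concrete, the following is a minimal discrete-time sketch of a Volterra series truncated at second order. It is only an illustration under assumed conventions: the function name `volterra_output`, the finite kernel memory, and the zero initial conditions are choices made here, not the dissertation's implementation.

```python
def volterra_output(u, h1, h2):
    """Evaluate a discrete-time Volterra series truncated at second order.

    y[n] = sum_i h1[i] * u[n-i]  +  sum_i sum_j h2[i][j] * u[n-i] * u[n-j]

    u  : list of input samples
    h1 : first-order kernel (length M), the discrete impulse response
    h2 : second-order kernel (M x M nested list)
    Inputs before n = 0 are assumed to be zero.
    """
    M = len(h1)
    y = []
    for n in range(len(u)):
        # Most recent M inputs: u[n], u[n-1], ..., u[n-M+1]
        past = [u[n - i] if n - i >= 0 else 0.0 for i in range(M)]
        # First-order term: ordinary convolution with h1
        y1 = sum(h1[i] * past[i] for i in range(M))
        # Second-order term: two-dimensional convolution with h2
        y2 = sum(h2[i][j] * past[i] * past[j]
                 for i in range(M) for j in range(M))
        y.append(y1 + y2)
    return y


if __name__ == "__main__":
    h1 = [1.0, 0.5]
    h2 = [[0.1, 0.2], [0.2, 0.3]]
    print(volterra_output([1.0, 0.0, 0.0], h1, h2))
```

For a unit sample input, the output at time n reduces to h1[n] plus the diagonal value h2[n][n]. Doubling the input doubles the first-order contribution but quadruples the second-order one, which is the polynomial character of the series noted above.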

Volterra  series  representations  have  been  used  to  model  nonlinear  systems  for  a 
diverse  array  of  engineering  applications.  For  many  systems  of  interest,  linear  models 
are  not  sufficient  to  characterize  the  dynamics.  Volterra  series  provide  a convenient 
framework  for  describing  the  nonlinear  behavior  of  these  systems.  In  many  cases,  the 
purpose  of  these  nonlinear  models  is  to  gain  a better  fundamental  understanding  of 
the  system  and  to  be  able  to  predict  the  system  response  to  arbitrary  excitations. 
Volterra  series  have  been  used  extensively  in  the  field  of  biomedical  engineering,  for 
example,  to  model  various  physiological  systems.  Volterra  models  have  been  employed 
to  predict  the  responses  of  mechanoreceptors,  which  convey  sensory  input  signals  to 
the  central  nervous  systems  of  vertebrates  and  some  invertebrates  [17],  [18],  [19]. 
Developing  models  for  mechanoreceptors,  which  exhibit  complex  nonlinear  behavior, 
is  important  in  order  to  better  understand  the  workings  of  skeletal  motor  control 
systems  and  may  be  useful  in  the  design  of  machines  that  can  interact  with  the 
physical  world.  Volterra  models  have  also  been  used  to  model  sensory  systems  that 
exhibit  nonlinear  feedback,  such  as  auditory  and  visual  systems  [20].  Such  models 
may  be  able  to  explain  certain  pathologies  such  as  tinnitus,  or  ringing  in  the  ears. 
Other  applications  of  Volterra  series  in  the  biomedical  field  include  characterizing  the 
nonstationary  behavior  of  the  rabbit  hippocampus  [21],  describing  the  viscoelastic 
response of the human lung [22], modeling the renal autoregulation of blood flow in
rats [23], [24], [25], and developing constitutive equations for soft tissues under axial
tension  [26]. 

Within  the  field  of  aerospace  engineering,  Volterra  series  have  recently  been 
employed  in  the  study  of  aeroelasticity,  the  interaction  of  the  wing  structure  of  an 
aircraft  with  the  surrounding  airflow.  Silva  [27],  [28]  identified  Volterra  kernels  from 
computational  fluid  dynamics  (CFD)  simulations  in  order  to  model  the  response  of 
an  aircraft  to  unsteady  aerodynamics  at  transonic  speeds.  In  aeroelasticity,  much 
attention  is  directed  towards  the  prediction  of  flutter,  a linear  structural  instability 
that  occurs  when  two  modes  of  vibration  interact  in  such  a manner  that  one  of 
the  modes  becomes  undamped.  The  result  is  that  any  small  disturbance  from  the 
airflow  causes  unstable  vibration  of  the  wing,  which  is  usually  catastrophic.  Several 
researchers  have  recently  applied  the  Volterra  theory  to  predict  the  onset  of  flutter 
and study aeroelastic responses [29], [30], [31], [32], [33], [34]. Limit cycle oscillation
(LCO)  is  a nonlinear  aeroelastic  phenomenon  that  leads  to  periodic  vibrations  of 
the  wing.  LCO  is  often  not  catastrophic,  but  can  result  in  unnecessary  fatigue  of 
the  aircraft  or  render  an  aircraft  incapable  of  performing  its  mission.  LCO  is  less 
understood  than  flutter,  and  researchers  are  just  beginning  to  explore  the  modeling 
of  LCO  with  Volterra  series  [32],  [33]. 

There  are  many  other  applications  in  which  Volterra  models  are  used  to  obtain 
a basic  understanding  of  physical  systems.  Volterra  series  have  been  used  to  solve 
inverse scattering problems, which entail the determination of the structures of unknown
objects through the computerized processing of diffracted waves [35], [36].
Inverse  scattering  problems  arise  in  such  diverse  applications  as  crystal  structure 
determination,  medical  ultrasound  tomography,  underground  imaging  problems,  and 
searching  for  dinosaur  bones  in  the  New  Mexico  desert.  Volterra  series  have  also  been 
applied to velocity estimation, which is essential in the design of motion detectors
for computer vision [37]. In their text, Worden and Tomlinson [38] discuss the application
of Volterra theory to modeling nonlinearities in structural dynamics, focusing
on systems such as nonlinear clamped beams and automotive shock absorbers. As a
final example, in the field of ocean engineering, Volterra models have been used in
the  analysis  of  wave  loading  on  offshore  structures  [39]. 

In  the  above  examples,  Volterra  series  were  used  to  gain  an  understanding  of  the 
behavior of various nonlinear systems. In many applications, the presence of nonlinearities
signifies a problem that must be alleviated. This is often the case in structural
dynamics,  where,  as  an  example,  the  nonlinear  response  of  a beam  can  be  indicative  of 
a fatigue  crack  [38].  Aiordachioaie  et  al.  [40]  used  Volterra  kernels  to  classify  various 
forms  of  nonlinearities,  such  as  dead  zone,  hysteresis,  saturation,  and  quantization, 
for the purpose of fault detection and compensation. In electrical engineering applications,
signals are often distorted due to nonlinear effects. A great deal of research
in electrical engineering has been directed towards the development of Volterra filters
that model and compensate for these nonlinearities. Volterra filters have been designed
for the cancellation of various forms of noise such as atmospheric, underwater
acoustic,  and  urban  vehicle  noise  [41].  Volterra  models  have  been  employed  in  the 
optimal  design  of  fiber-optic  communications  systems,  which  are  frequently  subject  to 
signal  interference  due  to  fiber  nonlinearities  [42].  In  many  wireless  communications 
systems, including microwave systems, nonlinearities in the circuits lead to intermodulation
distortion, which corrupts the desired signals. Volterra filters have been
used  to  characterize  and  attenuate  this  distortion  [43],  [44]  as  well  as  the  distortion 
that  can  occur  due  to  temperature-dependent  nonlinearities  [45].  Similarly,  Volterra 
filters have been developed to reduce the distortion in bandpass filters for radio receivers,
where nonlinear effects can render a signal unrecoverable [46]. Distortion in
communications systems due to nonlinearities in high-power amplifiers has also been
alleviated through the use of Volterra filters [47]. In addition, Volterra series have
been used to achieve nonlinear echo cancellation in telecommunications systems [48].
In the field of biomedical engineering, adaptive Volterra filters have been used to
filter stimulus artifacts, which consist of stimulus-dependent, time-varying, nonlinear
signals that interfere with the measurement of somatosensory evoked potentials
(SEPs)  [49].  SEPs  comprise  a class  of  signals  generated  by  the  central  and  peripheral 
nervous  systems  in  response  to  external  stimuli  and  are  important  for  the  diagnosis 
of  neuromuscular  disorders. 

Frequently,  the  ultimate  objective  of  Volterra  models  is  to  design  controllers  for 
nonlinear  systems.  In  contrast  to  some  of  the  previous  examples,  where  Volterra 
models  were  used  to  filter  nonlinear  effects,  in  these  cases  Volterra  models  are  used 
to  generate  input  signals  in  order  to  control  system  responses.  One  such  example 
is active noise control (ANC), in which the objective is to cancel the effect of a primary
noise source through destructive interference from a controlled secondary source.
Active noise control has been applied to cancel noise in such systems as heating, ventilating,
and air conditioning (HVAC) systems, headsets, and airplanes. Volterra
series  have  been  used  to  model  nonlinearities  in  primary  noise  sources  and  within 
the  control  systems  in  order  to  achieve  better  performance  of  ANC  systems  [50],  [51]. 
Doyle, Pearson, and Ogunnaike [52] have devoted an entire text to the development
of Volterra models for process control. In the text, as well as in a series of
papers  [53],  [54],  [55],  the  authors  have  outlined  nonlinear  model  predictive  control 
(NMPC)  strategies  based  on  Volterra  models  and  applied  them  to  various  chemical 
processes.  NMPC  is  a natural  extension  of  linear  model  predictive  control  (MPC), 
which entails the optimization of an objective function subject to constraints, to nonlinear
systems. The Volterra approach allows the nonlinear terms to be incorporated
as  correction  loops  in  the  NMPC  system.  The  controller  consists  of  these  nonlinear 
correction  loops  in  addition  to  an  optimal  linear  controller.  As  another  example, 
Volterra series have been used to model the equations of motion of a pendulum in
order to achieve tracking control via feedback linearization [56]. Volterra models have
also  been  employed  in  the  design  of  exact  model  matching  control  systems  that  are 
capable  of  matching  a reference  model  [57].  There  are  countless  other  examples.  At 
this  point,  it  should  be  clear  that  Volterra  series  representations  are  applicable  to  a 
large  number  of  diverse  systems  from  virtually  every  engineering  field. 

In  applying  Volterra  series  representations  to  generate  models  for  dynamical 
systems,  the  problem  reduces  to  one  of  identifying  the  Volterra  kernels  of  a given 
system.  Once  the  kernels  are  known,  the  system  response  to  any  arbitrary  input 
can  be  computed  from  the  Volterra  operators.  Kernel  identification  is  not  an  easy 
task  due  to  the  large  dimensions  of  the  higher-order  kernels  and  the  fact  that  the 
identification  problem  is  ill-posed.  Many  different  approaches  to  kernel  identification 
have  been  taken  in  the  literature.  It  should  be  noted  that,  in  theory,  any  ODE 
can  be  converted  into  an  integral  equation  and  solved  for  its  Volterra  kernels  [12], 
[15].  This  is  demonstrated  in  Chapter  2.  Unfortunately,  this  approach  is  usually 
not  practical  due  to  high  computational  costs.  In  addition,  it  requires  that  a given 
system  be  modeled  by  an  ODE,  which  is  a serious  limitation.  In  theory,  Volterra 
kernels  can  be  measured  by  applying  impulses  to  a continuous-time  system.  Silva 
used  this  approach  for  discrete-time  systems  by  applying  unit  sample  inputs,  the 
discrete  analog  to  impulse  inputs,  to  certain  computational  fluid  dynamics  (CFD) 
models  [27,28].  This  technique  is  not  directly  applicable  to  experimental  systems, 
however.  Methods  for  identifying  Volterra  kernels  from  experimental  data  include 
a white  noise/cross  correlation  method  developed  by  Lee  and  Schetzen  [58]  and  a 
harmonic  probing  technique  employed  by  Boyd  et  al.  [59].  Both  these  approaches 
are  limited,  however,  in  that  they  require  specific  input  excitations  which  are  not 
always  physically  realizable.  Another  common  approach  to  kernel  identification  is 
to  approximate  the  kernels  in  terms  of  a set  of  basis  functions.  Marmarelis  [60] 
estimated Volterra kernels for biological systems in terms of Laguerre polynomials,
and Reisenthel [34] used decaying exponentials to approximate the first and second-order
kernels of nonlinear aeroelastic systems. These approaches allow for physically
realizable inputs and are, therefore, suitable for experimental systems. Nikolaou
and Mantha [61] applied biorthogonal wavelets to compress the first and second-order
kernels that model a particular chemical reaction. Kurdila et al. [62] also used
biorthogonal  wavelets  for  the  approximation  of  Volterra  kernels  for  an  experimental 
aeroelastic  system.  Neural  networks  have  also  been  used  to  estimate  Volterra  kernels, 
and  Wray  and  Green  [63]  have  demonstrated  that  many  artificial  neural  networks  have 
an  equivalent  infinite  Volterra  series  representation.  The  kernel  identification  problem 
is  discussed  in  more  detail  in  Chapter  6. 

1.2  Wavelets  and  Multiresolution  Analysis 

The  field  of  wavelets  and  multiresolution  analysis  has  had  a profound  impact 
on  a number  of  disciplines,  including  mathematics,  physics,  geosciences,  meteorology, 
medical  imagery,  and  engineering.  Although  they  initially  served  predominantly  as  a 
signal  processing  tool,  wavelets  are  currently  employed  in  such  diverse  applications  as 
operator  theory,  approximation  theory,  reduced-order  modeling  of  dynamical  systems, 
particle  image  velocimetry  (PIV),  and  the  solution  of  partial  differential  equations 
(PDEs). This diversity can be attributed to the fact that there is a great deal of flexibility
in wavelet design. Wavelet bases can be developed that have varying properties
such  as  orthogonality,  smoothness,  approximation  order,  and  compact  support.  As 
wavelets  continue  to  evolve,  the  list  of  potential  applications  will  continue  to  grow. 

The  term  “wavelet”  was  coined  by  Meyer,  Morlet,  and  Grossman  in  1982.  As 
the  name  implies,  wavelets  resemble  small  waves  in  that  they  have  the  oscillatory 
character  of  waves  and  most  practical  examples  are  compactly  supported.  For  this 
reason,  these  functions  were  called  “ondelettes,”  which  translates  into  English  as 
“wavelets.” Generally speaking, a wavelet can be defined as any function $\psi \in L^2(\mathbb{R})$
whose family of dilates and translates $\psi_{j,k}(x) = 2^{j/2}\,\psi(2^j x - k)$ forms a basis for
$L^2(\mathbb{R})$, the Hilbert space of square-integrable functions. Hence, a wavelet expansion
of  a function  or  signal  affords  localization  in  both  the  time  and  frequency  domains. 
Wavelets  are  constructed  from  the  scaled  translates  and  dilates  of  a single  scaling 
function,  or  generator.  As  will  be  seen  later,  this  generalizes  to  multiwavelets,  which 
are constructed from multiple generators. Wavelets differ from many other
functions in that they are not obtained as the solutions of differential equations, but
rather as solutions of the two-scale relationship.
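
The dilation and translation structure described above can be checked numerically for the simplest case. The sketch below assumes the Haar wavelet on the unit interval and the normalization 2^(j/2) ψ(2^j x − k); the function names `haar`, `psi`, and `inner` are illustrative choices, not notation from this dissertation.

```python
def haar(x):
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0


def psi(j, k, x):
    """Dilate by 2**j, translate by k, and scale to unit L2 norm."""
    return 2.0 ** (j / 2.0) * haar(2.0 ** j * x - k)


def inner(j1, k1, j2, k2, n=1024):
    """L2 inner product on [0, 1) by the midpoint rule.

    The rule is exact here because both functions are piecewise constant
    on a dyadic grid coarser than the n-point grid.
    """
    h = 1.0 / n
    return h * sum(psi(j1, k1, (i + 0.5) * h) * psi(j2, k2, (i + 0.5) * h)
                   for i in range(n))


if __name__ == "__main__":
    print(inner(0, 0, 0, 0))  # norm of the mother wavelet
    print(inner(1, 0, 1, 1))  # two translates at the same scale
    print(inner(0, 0, 1, 0))  # two different scales
```

Each member of the family has unit norm, and distinct members are mutually orthogonal, whether they differ in translation or in scale.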

The  field  of  wavelets  and  multiresolution  analysis  began  to  take  root  in  the  early 
1980s as the synthesis of ideas from several disparate disciplines, including physics,
mathematics,  engineering,  and  signal  processing.  Much  of  the  earliest  work  in  the  field 
was  motivated  by  some  of  the  shortcomings  of  classical  Fourier  analysis  in  which  a 
function  or  signal  is  decomposed  in  terms  of  sinusoids  of  varying  frequency.  In  general, 
Fourier  methods  are  well  suited  for  the  analysis  of  smooth  signals  whose  frequency 
content  does  not  change  with  time.  However,  they  are  less  than  ideal  for  the  analysis 
of  signals  that  exhibit  sharp  features  or  have  frequencies  that  evolve  with  time.  This 
is  because  Fourier  methods  employ  global  basis  functions  and,  therefore,  cannot  easily 
yield  information  in  the  time  domain.  This  limitation  led  to  the  emergence  of  time- 
frequency  analysis  from  which  wavelets  are  descended.  Some  of  the  earliest  efforts  in 
this  area  were  put  forth  by  Wigner  [64]  in  1932  in  his  research  in  quantum  mechanics 
and  Gabor  [65]  in  1946  in  his  work  towards  the  identification  of  coherent  features  in 
sound.  In  a similar  manner,  advancements  in  the  field  of  signal  processing  formed  part 
of  the  foundation  for  wavelets  and  multiresolution  analysis.  In  the  1970’s,  the  use  of 
banks  of  parallel  filters,  usually  in  the  form  of  pairs  of  high  and  low  pass  filters  with 
complementary properties, became an important topic. During this time, Croisier [66]
and  Crochiere  [67]  made  significant  contributions  toward  the  development  of  subband 
coding  and  its  realization  using  quadrature  mirror  filter  banks.  This  field  has  made 
a precise  study  of  how  a signal  can  be  analyzed  by  decomposing  it  into  constituent 
frequency bands. Smith and Barnwell [68] developed conditions whereby the original
signal  can  be  perfectly  reconstructed  using  certain  types  of  quadrature  mirror  filter 
banks.  It  will  be  shown  later  that,  for  orthonormal  wavelets,  the  wavelet  transform 
can  be  viewed  as  the  implementation  of  a filter  bank  with  perfect  reconstruction 
qualities. 
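
As a small illustration of perfect reconstruction, the sketch below implements one level of a two-channel filter bank using the Haar low-pass and high-pass filters. It is only a sketch under stated assumptions: the Haar filters are chosen for simplicity, the signal length is assumed even, and the function names `analyze` and `synthesize` are hypothetical.

```python
import math

S = 1.0 / math.sqrt(2.0)  # normalization that makes the Haar filters orthonormal

def analyze(x):
    """One level of the Haar filter bank: low-pass and high-pass outputs, downsampled by 2."""
    half = len(x) // 2
    low = [S * (x[2 * i] + x[2 * i + 1]) for i in range(half)]
    high = [S * (x[2 * i] - x[2 * i + 1]) for i in range(half)]
    return low, high

def synthesize(low, high):
    """Inverse of analyze(): recombine the two subbands into the original samples."""
    x = []
    for a, d in zip(low, high):
        x.extend([S * (a + d), S * (a - d)])
    return x

signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
low, high = analyze(signal)
reconstructed = synthesize(low, high)
```

Because the transform is orthonormal, the energy of the two subbands together equals the energy of the original signal, and `reconstructed` agrees with `signal` up to rounding.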

From  the  preceding  discussion,  it  should  be  clear  that  the  foundation  for  wavelets 
and multiresolution analysis was laid long before the field gained recognition as a
distinct  discipline  in  the  1980’s.  Indeed,  it  is  now  widely  recognized  that  the  first 
wavelet,  the  Haar  function,  was  constructed  in  1910.  This  function  has  perhaps  the 
simplest  form  of  any  wavelet  in  existence.  Although  its  utility  is  limited  in  most 
applications  because  it  is  a discontinuous  function  with  low  approximation  order,  the 
Haar  function  serves  as  an  excellent  prototype  wavelet  because  many  of  the  properties 
inherent  to  wavelets  are  easy  to  visualize  for  this  particular  function.  In  addition,  the 
Haar  wavelet  has  several  attractive  features  in  that  it  can  be  expressed  in  closed  form, 
yields  an  orthonormal  basis,  is  compactly  supported,  and  is  efficient  to  implement. 
While  the  Haar  function  possesses  simplicity  of  form  in  the  time  (or  spatial)  domain, 
the sinc, or Littlewood-Paley, wavelet takes the form of a characteristic function of
an  interval  in  the  frequency  domain.  Therefore,  it  acts  as  an  ideal  bandpass  filter 
and  the  resulting  wavelet  decomposition  has  a particularly  simple  interpretation. 
Unfortunately, the sinc scaling function and wavelet are not compactly supported,
which is a significant limitation in applications. Hence, the sinc wavelet is of interest
only  from  an  academic  viewpoint. 

In contrast to the Haar and sinc wavelets, most developments in the field of
wavelets  and  multiresolution  analysis  center  around  the  derivation  of  families  of 
wavelets  in  which  the  members  have  specific  properties  that  vary  within  the  fam- 
ily. These  properties  may  include  smoothness,  approximation  order,  and  length  of 
support.  For  example,  in  1986,  Meyer  [69]  generated  a family  of  wavelets  whose 
members are in $C^k$, the set of continuous functions with $k$ continuous derivatives,
for  k arbitrary  but  finite.  These  wavelets  lack  some  desirable  characteristics,  how- 
ever, such  as  compact  support,  orthonormality,  and  closed  form  representations.  In 
1988,  Daubechies  [70]  constructed  a family  of  wavelets  that  are  compactly  supported, 
orthonormal, and whose functions vary in $C^k$ for $k$ arbitrary but finite. In other
words,  Daubechies  wavelets  combine  most  of  the  desirable  characteristics  of  the  Haar 
function  and  Meyer  wavelets.  This  was  a revolutionary  development  that  led  to  the 
popularization  of  wavelets  for  use  in  the  areas  of  signal  processing,  statistics,  and 
numerical  analysis.  However,  Daubechies  wavelets  do  not  have  closed  form  repre- 
sentations and  do  not  exhibit  good  symmetry  or  antisymmetry  properties.  In  1991, 
Beylkin,  Coifman,  and  Rokhlin  [1]  derived  a family  of  wavelets  that  exhibit  similar 
characteristics  in  that  they  are  orthogonal,  compactly  supported,  and  possess  good 
smoothness  and  approximation  properties.  These  wavelets  have  better  symmetry 
properties  than  Daubechies  wavelets.  The  symmetry  properties  were  achieved  via 
a tradeoff  that  resulted  in  wavelets  of  somewhat  larger  support.  In  1992,  Cohen, 
Daubechies,  and  Feauveau  [71]  generated  a family  of  biorthogonal  wavelets  that  has 
become  the  most  frequently  used  wavelet  family  for  applications  to  denoising  and 
compression.  By  trading  orthogonality  for  biorthogonality,  they  were  able  to  con- 
struct wavelets  with  better  symmetry  and  antisymmetry  characteristics  than  those 
derived  by  Beylkin,  Coifman,  and  Rokhlin  while  retaining  compact  support  and  good 
smoothness  and  approximation  properties. 

The  design  of  wavelet  families  always  involves  tradeoffs  in  which  one  particular 
desired  property  is  obtained  at  the  expense  of  another.  In  the  mid  1990’s,  the  intro- 
duction of  multiwavelets,  wavelets  that  are  generated  from  multiple  scaling  functions, 
increased  the  level  of  available  flexibility  in  wavelet  design.  Because  multiwavelet  con- 
structions employ  a number  of  scaling  functions  and  wavelets,  the  resulting  functions 
can exhibit a wider array of properties. This does not come without a cost, however,
since multiwavelets are typically more costly to implement than their single-generator
counterparts. In 1994, Donovan, Geronimo, Hardin, and Massopust [72,73]
constructed  a class  of  multiwavelets  that  are  compactly  supported,  orthonormal,  and 
have  good  smoothness  and  approximation  properties.  These  wavelets  do  not  have 
closed  form  expressions  but  rather  are  fractal  in  nature.  Later,  in  1996,  they  gen- 
erated a family  of  piecewise-polynomial  multiwavelets  having  the  same  properties  as 
well  as  closed  form  representations  [2,3].  These  multiwavelets  were  derived  from  the 
classical  finite  element  basis  functions  via  the  technique  of  intertwining  multireso- 
lution analyses.  By  using  the  finite  element  functions  of  appropriate  order,  multi- 
wavelets of  arbitrary  approximation  order  can  be  constructed.  Donovan,  Geronimo, 
and  Hardin  [2]  carried  out  this  derivation  to  generate  piecewise-linear  multiwavelets. 
In  this  dissertation,  their  method  is  employed  to  construct  piecewise-quadratic  and 
piecewise-cubic multiwavelets. Some of the wavelet bases mentioned in the preceding
discussion  are  shown  in  Figure  (1.1). 

Wavelets  have  had  a profound  impact  on  a number  of  different  disciplines,  includ- 
ing mathematics,  physics,  geoscience,  meteorology,  medical  imagery,  and  engineering. 
There  are  a large  number  of  wavelet  texts  that  are  mathematical  in  nature,  such  as  the 
works  by  Daubechies  [74],  Ogden  [75],  Chui  [76],  [77],  and  Wojtaszczyk  [78].  Mean- 
while, a great  deal  of  attention  has  been  directed  towards  wavelet  applications.  One 
of  the  earliest  applications  of  wavelets  was  by  Morlet  et  al.  [79]  in  the  study  of  seismic 
phenomena encountered in the course of oil prospecting expeditions. Beginning in the early
1980’s, wavelets started to become popular in signal processing applications such as
denoising,  data  compression,  and  two-dimensional  image  compression.  Wavelets  were 
seen  as  an  attractive  alternative  to  the  traditional  Fourier  series  for  the  analysis  of 
signals  that  exhibit  sharp  features  or  transient  behavior.  As  will  be  discussed  later, 
this  was  due  in  large  part  to  the  fact  that  wavelets  afford  the  localization  of  signals  in 
both the time and frequency domains. The development of fast algorithms for computing
the discrete wavelet transform by Mallat [80] and Meyer [69] was instrumental
in the emergence of wavelets as a practical signal processing tool. There are many
texts devoted to the use of wavelets in signal processing, including books by Mallat [81],
Strang and Nguyen [82], Vetterli and Kovacevic [83], and Burrus, Gopinath,
and Guo [84].

Figure 1.1: The Evolution of Wavelet Bases, 1910-2000
  1910, Haar wavelet: orthonormal, compactly supported, low approximation order, poor smoothness
  1988, Daubechies family of wavelets: orthonormal, compactly supported, good approximation order and smoothness, no symmetry or anti-symmetry
  1992, Cohen-Daubechies-Feauveau family: biorthogonal, compactly supported, good approximation order and smoothness, symmetry or anti-symmetry
  1999, Hardin-Kurdila-Prazenica: orthonormal, compactly supported, good approximation order and smoothness, symmetry or anti-symmetry

As  the  field  of  wavelets  and  multiresolution  analysis  continued  to  grow  in  the 
late  1980’s,  the  use  of  wavelets  expanded  to  a diverse  array  of  applications.  Wavelets 
have  been  implemented  in  the  solution  of  partial  differential  equations  (PDEs)  over 
smooth manifolds [85]. Similarly, they have been used to solve elliptic boundary-value
problems  [86]  and  wavelet-based  multilevel  methods  have  been  derived  for  the  solu- 
tion of  elliptic,  parabolic,  and  hyperbolic  PDEs  [87].  Donoho  [88]  has  demonstrated 
that  wavelet  systems  are  nearly  optimal  for  a wide  class  of  parametric  estimation 
problems.  In  the  area  of  fluid  dynamics,  wavelet  bases  have  been  employed  in  the 
analytical  study  of  the  Navier-Stokes  equations  and  in  a qualitative  study  of  tur- 
bulence [89].  Wavelets  have  also  been  applied  in  the  post-processing  of  flow  fields 
from  particle  image  velocimetry  (PIV)  [90],  and  divergence-free  wavelet  bases  have 
been  used  to  obtain  reduced-order  representations  for  flow  control  simulations  [91]. 
In  the  field  of  biomedical  engineering,  wavelets  have  been  employed  to  remove  arti- 
facts and  noise  from  measured  electroencephalogram  (EEG)  signals  [92].  Wavelets 
have also been used for fault detection in mechanical systems [93] and in the characterization
of nonlinearities in aeroelastic systems [94]. Also in the field of aeroelasticity,
wavelets have been used in the estimation of damping parameters for flutter
analysis  [95].  As  a final  example,  of  most  relevance  to  this  dissertation,  Beylkin, 
Coifman,  and  Rokhlin  [1]  applied  wavelets  for  the  compression  of  various  integral  op- 
erators. Nikolaou  and  Mantha  [61]  used  biorthogonal  wavelets  to  compress  the  first 
and  second-order  Volterra  kernels  corresponding  to  various  chemical  processes.  Kur- 
dila  et  al.  [62]  derived  reduced-order  representations  of  Volterra  kernels  in  terms  of 
biorthogonal  wavelets.  This  work,  as  well  as  the  research  detailed  in  this  dissertation, 
was motivated in large part by the effectiveness of wavelets in compressing integral
operators  [1],  as  well  as  the  work  of  Nikolaou  and  Mantha  [61]. 


CHAPTER  2 

FUNDAMENTALS  OF  VOLTERRA  SERIES 


Volterra  series  representations  comprise  one  of  the  many  tools  available  for  the 
analysis  of  nonlinear  systems.  Volterra  series  can  be  applied  to  either  of  the  two 
major aspects of nonlinear system theory: analysis and synthesis. In analysis, the
goal  is  to  determine  the  system  operator  that  maps  the  input  into  the  output  while, 
in  synthesis,  a system  with  a given  operator  is  constructed  from  a set  of  elemental 
systems.  The  focus  of  this  dissertation  is  on  analysis,  which  entails  the  identification 
of  the  kernels  that  appear  in  Volterra  series  representations  of  dynamical  systems. 
As  will  be  seen,  a given  dynamical  system  is  completely  characterized  in  terms  of 
its  Volterra  kernels.  In  this  chapter,  some  fundamental  properties  of  Volterra  series 
are  reviewed.  In  Section  2.1,  various  forms  for  the  Volterra  operators  under  the  con- 
ditions of  time-invariance,  causality,  and  one-sided  inputs  are  examined.  Then,  in 
Section  2.2,  the  first-order  Volterra  operator  is  discussed  in  detail.  The  first-order 
operator  is  sufficient  to  characterize  the  dynamics  of  linear  systems  and  is  directly 
related  to  classical  methods  for  the  analysis  of  linear,  time-invariant  systems  such 
as  convolution  and,  in  the  frequency  domain,  transfer  functions.  In  Section  2.3,  the 
higher-order  Volterra  operators  that  are  needed  to  describe  nonlinear  system  dynam- 
ics are  discussed.  Because  the  scope  of  this  dissertation  is  limited  to  the  identification 
of  first  and  second-order  Volterra  kernels  for  nonlinear  systems,  the  focus  is  on  the 
second-order operator. Various forms for the higher-order kernels, namely the symmetric
and triangular forms, are also discussed. Section 2.4 describes the relationship
between  ordinary  differential  equations  (ODEs)  and  Volterra  series  representations. 
An  ODE  can  be  converted  into  a Volterra  integral  equation  through  the  process  of 
integration. Using the method of successive substitutions, also known as Picard iteration,
a solution to the resulting integral equation can be derived in terms of a Volterra
series.  This  procedure  is  demonstrated  for  both  linear  and  bilinear  state  space  sys- 
tems, and  generalizes  to  all  ODEs.  Finally,  in  Section  2.5,  some  of  the  convergence, 
existence,  and  uniqueness  issues  pertaining  to  Volterra  series  representations  are  dis- 
cussed. Much  of  the  information  presented  in  this  chapter  has  been  taken  from  the 
texts  by  Rugh  [15]  and  Schetzen  [16].  The  interested  reader  can  refer  to  these  texts 
for  a more  exhaustive  discussion  of  Volterra  theory. 

2.1  General  Concepts 

A system can be viewed as an operator $H$ that maps an input $u$ into an output $y$:

$$y(t) = H[u(t), t] \tag{2.1}$$

If  the  system  is  time-invariant,  the  operator  is  no  longer  an  explicit  function  of  time 
and  can  be  written  as  H[u(t)].  In  this  discussion,  time-invariance  is  assumed  unless 
otherwise  stated.  The  output  of  a nonlinear  dynamical  system  can  be  expressed  in 
terms  of  a Volterra  series,  which  is  an  infinite  series  of  integral  operators 


$$y(t) = \sum_{n=1}^{\infty} y_n(t) = \sum_{n=1}^{\infty} H_n[u(t)] \tag{2.2}$$

In  equation  (2.2),  yn(t)  is  the  output  of  the  nth-order  Volterra  operator  Hn  that,  for 
the  time-invariant  case,  takes  the  form 


$$y_n(t) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} h_n(\sigma_1, \ldots, \sigma_n)\, u(t - \sigma_1) \cdots u(t - \sigma_n)\, d\sigma_1 \cdots d\sigma_n \tag{2.3}$$

where  hn  is  termed  the  nth-order  Volterra  kernel  of  the  system.  The  form  of  the 
nth-order  Volterra  operator  shown  in  equation  (2.3)  is  sometimes  referred  to  as  a 
generalized convolution integral. It should be noted that a zero-order term $y_0$ can
be  added  in  equation  (2.2)  to  represent  systems  that  have  nonzero  responses  to  an 
identically  zero  input.  For  example,  the  zero-order  operator  characterizes  part  of 
the  transient  response  of  a system  to  nonzero  initial  conditions.  In  this  dissertation, 
most  of  the  examples  considered  have  initial  conditions  of  zero  so  that  the  zero-order 
operator  vanishes.  In  addition  to  being  a series  of  operators  which  transforms  the 
input  function  into  the  output  function,  a Volterra  series  representation  can  also  be 
viewed  as  a functional  series.  If  one  is  interested  in  the  output  at  a particular  instant 
in time $t_1$, the Volterra series acts as a functional, mapping the input function $u$ into
the output at time $t_1$, $y(t_1)$.

Clearly,  a system  is  characterized  by  its  Volterra  kernels  since,  once  the  kernels 
of  a particular  system  are  known,  the  response  to  any  input  can  be  predicted  using 
equations  (2.2)  and  (2.3).  It  should  be  noted  that  the  kernels  that  appear  in  equation 
(2.3)  are  not  necessarily  unique  for  a given  dynamical  system.  Indeed,  except  for  the 
first-order  kernel,  the  nth-order  Volterra  kernel  of  a nonlinear  system  can  be  expressed 
in  more  than  one  form.  However,  as  will  be  shown  in  Section  2.3,  a symmetric  form  for 
the  nth-order  Volterra  kernel  can  always  be  derived  that  is  unique  for  a given  system. 
The  output  of  certain  systems  can  be  expressed  in  terms  of  a finite  number  of  Volterra 
operators  in  equation  (2.2).  In  some  cases,  a system  is  completely  characterized  by 
only  one  Volterra  kernel.  For  example,  a linear  system  can  be  represented  in  terms  of 
the  first-order  Volterra  kernel  alone,  since,  as  will  be  discussed  later,  the  first-order 
Volterra  operator  is  a linear  operator.  The  higher-order  Volterra  operators,  which  are 
nonlinear,  do  not  contribute  to  the  response  of  a linear  system.  A system  that  can 
be  characterized  in  terms  of  the  nth-order  Volterra  kernel  alone  is  known  as  a degree 
n homogeneous  system  [15].  The  output  of  such  a system  is  given  by 
$$y(t) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} h_n(\sigma_1, \ldots, \sigma_n)\, u(t - \sigma_1) \cdots u(t - \sigma_n)\, d\sigma_1 \cdots d\sigma_n \tag{2.4}$$

The  term  degree  n homogeneous  system  refers  to  the  fact  that,  given  an  input  of  the 
form $\alpha u$, where $\alpha$ is a scalar, the output of the system is $\alpha^n y$, where $y$ is the response
of the system to the input $u$. This property is readily apparent from equation (2.4).
Some  systems  can  be  described  in  terms  of  a finite  sum  of  Volterra  operators 

$$y(t) = \sum_{n=1}^{N} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} h_n(\sigma_1, \ldots, \sigma_n)\, u(t - \sigma_1) \cdots u(t - \sigma_n)\, d\sigma_1 \cdots d\sigma_n \tag{2.5}$$

Such  systems  are  sometimes  referred  to  as  polynomial  systems  of  degree  N,  where 
$h_N$ is the highest-order nonzero kernel needed to characterize the system dynamics.
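
The homogeneity property and the finite-sum form in equation (2.5) can be illustrated with a discretized sketch. The code below is an illustration only: it assumes causal kernels and an input that is zero for negative time (conditions discussed later in this section), approximates the integrals by Riemann sums with step `dt`, and uses made-up exponential kernels. The function names `volterra1` and `volterra2` are hypothetical.

```python
import math

def volterra1(h1, u, dt):
    """Discretized first-order operator: y1[k] = dt * sum_i h1[i] * u[k - i]."""
    return [dt * sum(h1[i] * u[k - i] for i in range(min(k + 1, len(h1))))
            for k in range(len(u))]

def volterra2(h2, u, dt):
    """Discretized second-order operator: y2[k] = dt^2 * sum_{i,j} h2[i][j] * u[k-i] * u[k-j]."""
    m = len(h2)
    return [dt * dt * sum(h2[i][j] * u[k - i] * u[k - j]
                          for i in range(min(k + 1, m))
                          for j in range(min(k + 1, m)))
            for k in range(len(u))]

dt, m = 0.1, 20
h1 = [math.exp(-i * dt) for i in range(m)]                                  # hypothetical kernel
h2 = [[0.5 * math.exp(-(i + j) * dt) for j in range(m)] for i in range(m)]  # hypothetical kernel
u = [math.sin(k * dt) for k in range(50)]

alpha = 3.0
y1, y2 = volterra1(h1, u, dt), volterra2(h2, u, dt)
y1_scaled = volterra1(h1, [alpha * v for v in u], dt)   # expected: alpha * y1
y2_scaled = volterra2(h2, [alpha * v for v in u], dt)   # expected: alpha**2 * y2

# Degree-2 polynomial system: sum of the first- and second-order outputs (N = 2).
y_poly = [a + b for a, b in zip(y1, y2)]
```

Scaling the input by `alpha` scales the first-order output by `alpha` and the second-order output by `alpha**2`, which is the degree n homogeneity described above for n = 1 and n = 2.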

The  output  of  the  nth-order  Volterra  operator,  given  by  equation  (2.3),  can  be 
expressed  in  an  alternate  form  by  performing  a change  of  variables: 

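$$\begin{aligned} \tau_1 &= t - \sigma_1, & d\tau_1 &= -d\sigma_1 \\ &\;\;\vdots & &\;\;\vdots \\ \tau_n &= t - \sigma_n, & d\tau_n &= -d\sigma_n \end{aligned}$$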

The limits of integration for each integral become $\tau_1 : \infty \to -\infty, \ldots, \tau_n : \infty \to -\infty$.
Substituting into equation (2.3), we obtain


$$y_n(t) = \int_{\infty}^{-\infty} \cdots \int_{\infty}^{-\infty} h_n(t - \tau_1, \ldots, t - \tau_n)\, u(\tau_1) \cdots u(\tau_n)\, (-d\tau_1) \cdots (-d\tau_n)$$

Noting  that  the  negative  sign  in  each  integral  changes  the  order  of  the  integration 
limits,  we  have 


$$y_n(t) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} h_n(t - \tau_1, \ldots, t - \tau_n)\, u(\tau_1) \cdots u(\tau_n)\, d\tau_1 \cdots d\tau_n \tag{2.6}$$

Changing variables once again in equation (2.6), it is clear that we have the following
equivalent  expressions  for  the  output  of  the  nth-order  Volterra  operator  for  a time- 
invariant  system: 


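$$y_n(t) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} h_n(\sigma_1, \ldots, \sigma_n)\, u(t - \sigma_1) \cdots u(t - \sigma_n)\, d\sigma_1 \cdots d\sigma_n \tag{2.7}$$

$$y_n(t) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} h_n(t - \sigma_1, \ldots, t - \sigma_n)\, u(\sigma_1) \cdots u(\sigma_n)\, d\sigma_1 \cdots d\sigma_n \tag{2.8}$$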

It  is  not  difficult  to  demonstrate  that  the  Volterra  operators  in  equations  (2.7) 
and  (2.8)  are  time-invariant,  or  stationary,  operators.  A time-invariant  operator  does 
not  vary  with  time  so  that  a time  shift  of  the  input  results  in  an  equivalent  time  shift 
of  the  output.  Therefore,  an  operator  H defined  as 


$$y(t) = H[u(t)]$$

is  time-invariant,  or  stationary,  if 


$$y(t - \tau) = H[u(t - \tau)] \tag{2.9}$$

Checking  the  expression  for  the  nth-order  Volterra  operator  in  equation  (2.7)  for 
time-invariance, the time-shifted output $y_n(t - \tau)$ is given by


$$y_n(t - \tau) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} h_n(\sigma_1, \ldots, \sigma_n)\, u(t - \tau - \sigma_1) \cdots u(t - \tau - \sigma_n)\, d\sigma_1 \cdots d\sigma_n \tag{2.10}$$

The operator $H_n$ acting on the time-shifted input $u(t - \tau)$ yields

$$H_n[u(t - \tau)] = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} h_n(\sigma_1, \ldots, \sigma_n)\, u(t - \tau - \sigma_1) \cdots u(t - \tau - \sigma_n)\, d\sigma_1 \cdots d\sigma_n \tag{2.11}$$

Clearly,  then,  comparing  equations  (2.10)  and  (2.11), 


$$y_n(t - \tau) = H_n[u(t - \tau)] \tag{2.12}$$

and  it  follows  that  Hn  is  a time-invariant  operator.  Nonstationary  systems  can  also 
be  represented  using  Volterra  series.  In  this  case,  the  expression  for  the  nth-order 
Volterra  operator  in  equation  (2.8)  takes  the  form 


$$y_n(t) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} h_n(t, \sigma_1, \ldots, \sigma_n)\, u(\sigma_1) \cdots u(\sigma_n)\, d\sigma_1 \cdots d\sigma_n \tag{2.13}$$

where $h_n(t, \sigma_1, \ldots, \sigma_n)$ is a nonstationary kernel. It follows that for a given kernel
$h_n(t, \sigma_1, \ldots, \sigma_n)$ to be stationary, there must exist a kernel $g_n(t_1, \ldots, t_n)$ such that

$$h_n(t, \sigma_1, \ldots, \sigma_n) = g_n(t - \sigma_1, \ldots, t - \sigma_n) \tag{2.14}$$

where $t_1 = t - \sigma_1, \ldots, t_n = t - \sigma_n$. Rugh [15] has demonstrated that a kernel
$h_n(t, \sigma_1, \ldots, \sigma_n)$ can be checked for stationarity by time-shifting its arguments by $t$.
If the relationship

$$h_n(t, \sigma_1, \ldots, \sigma_n) = h_n(0, \sigma_1 - t, \ldots, \sigma_n - t) \tag{2.15}$$

is satisfied, then the kernel is stationary and $g_n(t_1, \ldots, t_n)$ is obtained as

$$g_n(t_1, \ldots, t_n) = h_n(0, -t_1, \ldots, -t_n) \tag{2.16}$$

Hence, equation (2.15) is a convenient means for testing a given kernel for time-
invariance [15]. If this condition is satisfied, equation (2.16) gives the stationary form
of  the  kernel  that  would  appear  in  equation  (2.8). 
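
The stationarity test of equation (2.15) and the construction in equation (2.16) can be sketched for a second-order kernel as follows. This is a minimal illustration, assuming two made-up kernels (one that depends only on the differences between $t$ and its other arguments, one with an explicit dependence on $t$) and a finite set of sample points; a pointwise check of this kind can refute stationarity but cannot prove it. The names `is_stationary` and `g_from_h` are hypothetical.

```python
import math

def h_stationary(t, s1, s2):
    """Hypothetical kernel depending only on t - s1 and t - s2."""
    return math.exp(-(t - s1)) * math.exp(-2.0 * (t - s2))

def h_nonstationary(t, s1, s2):
    """Hypothetical kernel with an explicit dependence on t."""
    return t * math.exp(-(t - s1) - (t - s2))

def is_stationary(h, samples, tol=1e-12):
    """Check h(t, s1, s2) == h(0, s1 - t, s2 - t) at every sample point (eq. 2.15)."""
    return all(abs(h(t, s1, s2) - h(0.0, s1 - t, s2 - t)) <= tol * max(1.0, abs(h(t, s1, s2)))
               for (t, s1, s2) in samples)

def g_from_h(h):
    """Stationary form g(t1, t2) = h(0, -t1, -t2) (eq. 2.16)."""
    return lambda t1, t2: h(0.0, -t1, -t2)

samples = [(0.1 * a, 0.2 * b, 0.3 * c)
           for a in range(5) for b in range(5) for c in range(5)]
g = g_from_h(h_stationary)
```

For the stationary example, `g(t1, t2)` reduces to `exp(-t1) * exp(-2 * t2)`, so that `g(t - s1, t - s2)` reproduces `h_stationary(t, s1, s2)` as required by equation (2.14).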

Frequently,  dynamical  systems  are  assumed  to  be  causal,  which  means  that  the 
value  of  the  present  output  is  not  dependent  on  future  inputs.  Intuitively,  it  is 
apparent  that  all  physical  systems  are  causal.  Recall  that  the  output  of  the  nth- 
order  Volterra  operator  for  a time-invariant  dynamical  system  can  be  written  in  the 
form 


$$y_n(t) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} h_n(\sigma_1, \ldots, \sigma_n)\, u(t - \sigma_1) \cdots u(t - \sigma_n)\, d\sigma_1 \cdots d\sigma_n \tag{2.17}$$

It is clear from equation (2.17) that, given the present time $t$, if $\sigma_i < 0$ for any value
of $i = 1, \ldots, n$, then the corresponding input $u(t - \sigma_i)$ represents a future input at
time greater than $t$. It follows that in order for causality to hold, we must have

$$h_n(\sigma_1, \ldots, \sigma_n) = 0 \quad \text{for any } \sigma_i < 0,\ i = 1, \ldots, n \tag{2.18}$$

This  condition  reflects  the  fact  that  future  inputs  have  no  influence  on  the  present 
output.  Similarly,  if  the  time-invariant  Volterra  operator  is  in  the  alternate  form 


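$$y_n(t) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} h_n(t - \sigma_1, \ldots, t - \sigma_n)\, u(\sigma_1) \cdots u(\sigma_n)\, d\sigma_1 \cdots d\sigma_n \tag{2.19}$$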

it is clear that, given the present time $t$, if $\sigma_i > t$ for any value of $i = 1, \ldots, n$, then
the corresponding input $u(\sigma_i)$ represents an input at time greater than $t$. Then, it
follows that causality requires

$$h_n(t - \sigma_1, \ldots, t - \sigma_n) = 0 \quad \text{for any } \sigma_i > t,\ i = 1, \ldots, n \tag{2.20}$$




The  causality  conditions  in  equations  (2.18)  and  (2.20)  can  be  directly  incorporated 
into  the  corresponding  expressions  for  the  Volterra  operators  by  changing  the  limits 
of integration. Equation (2.18) implies that, for a causal system, the lower limits of
integration  in  equation  (2.17)  can  be  changed  to  zero: 


$$y_n(t) = \int_{0}^{\infty} \cdots \int_{0}^{\infty} h_n(\sigma_1, \ldots, \sigma_n)\, u(t - \sigma_1) \cdots u(t - \sigma_n)\, d\sigma_1 \cdots d\sigma_n \tag{2.21}$$

since $h_n$ is zero if any $\sigma_i$, $i = 1, \ldots, n$, is negative. Alternatively, the causality
condition in equation (2.20) implies that the upper integration limits on each $\sigma_i$ can
be reduced to $t$ in equation (2.19):

$$y_n(t) = \int_{-\infty}^{t} \cdots \int_{-\infty}^{t} h_n(t - \sigma_1, \ldots, t - \sigma_n)\, u(\sigma_1) \cdots u(\sigma_n)\, d\sigma_1 \cdots d\sigma_n \tag{2.22}$$


It  should  also  be  noted  that,  in  practice,  one-sided  inputs  are  usually  encountered 
[15].  One-sided  inputs  are  usually  characterized  by  u(t)  = 0 for  t < 0.  Then,  for 
causal,  time-invariant  systems  with  one-sided  inputs,  the  upper  limits  of  integration 
in  equation  (2.21)  can  be  reduced  to  t , while  the  lower  limits  of  integration  in  equation 
(2.22)  can  be  changed  to  0.  Under  these  conditions,  the  nth-order  Volterra  operator 
takes  either  of  the  following  forms: 


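$$y_n(t) = \int_{0}^{t} \cdots \int_{0}^{t} h_n(\sigma_1, \ldots, \sigma_n)\, u(t - \sigma_1) \cdots u(t - \sigma_n)\, d\sigma_1 \cdots d\sigma_n \tag{2.23}$$

$$y_n(t) = \int_{0}^{t} \cdots \int_{0}^{t} h_n(t - \sigma_1, \ldots, t - \sigma_n)\, u(\sigma_1) \cdots u(\sigma_n)\, d\sigma_1 \cdots d\sigma_n \tag{2.24}$$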


Therefore,  it  is  apparent  that  under  certain  conditions,  the  limits  of  integration  in 
the  various  expressions  for  the  Volterra  operators  can  take  on  a number  of  different 
values. In such cases, the integration limits that are most convenient are usually
chosen. 
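
The equivalence of the two finite-limit forms in equations (2.23) and (2.24) can be checked numerically for $n = 2$. The sketch below assumes a made-up causal kernel, a sampled one-sided input, and a simple Riemann-sum discretization with step `dt`; the function names are hypothetical. Both sums run over the same set of lag pairs in a different order, so they should agree up to rounding.

```python
import math

def y2_form_a(h2, u, k, dt):
    """Eq. (2.23) with n = 2: kernel evaluated at the lags, input at t minus the lags."""
    return dt * dt * sum(h2(i * dt, j * dt) * u[k - i] * u[k - j]
                         for i in range(k + 1) for j in range(k + 1))

def y2_form_b(h2, u, k, dt):
    """Eq. (2.24) with n = 2: kernel evaluated at t minus sigma, input at sigma."""
    return dt * dt * sum(h2((k - i) * dt, (k - j) * dt) * u[i] * u[j]
                         for i in range(k + 1) for j in range(k + 1))

def h2(s1, s2):
    """Hypothetical causal second-order kernel, used only for non-negative arguments."""
    return math.exp(-s1 - 2.0 * s2)

dt = 0.05
u = [math.sin(2.0 * k * dt) for k in range(40)]   # one-sided input sampled on t >= 0

# Output at each time step t = k * dt computed with both forms.
pairs = [(y2_form_a(h2, u, k, dt), y2_form_b(h2, u, k, dt)) for k in range(len(u))]
```

In practice the choice between the two forms is a matter of convenience, as noted above; the first keeps the kernel arguments independent of the current time, which is often easier to tabulate.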

As  a final  note,  the  results  given  in  this  section  are  applicable  to  single-input/single- 
output  (SISO)  systems.  It  is  not  difficult,  however,  to  extend  these  results  to  multiple- 
input/multiple-output  (MIMO)  systems.  For  multiple  outputs,  each  output  has  its 
own  Volterra  series  expansion  with  a unique  set  of  kernels.  In  the  case  where  there 
are multiple inputs, a given output will have a Volterra series representation consisting
of “self-kernels” corresponding to each individual input and “cross-kernels” that
involve  multiple  input  functions  [96].  As  an  example,  a second-order  Volterra  series 
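representation of a system with two inputs, $u$ and $v$, and a single output $y$ is given
by

$$\begin{aligned} y(t) ={}& \int_{-\infty}^{\infty} h_{1,u}(\sigma)\, u(t - \sigma)\, d\sigma + \int_{-\infty}^{\infty} h_{1,v}(\sigma)\, v(t - \sigma)\, d\sigma \\ &+ \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} h_{2,u}(\sigma_1, \sigma_2)\, u(t - \sigma_1)\, u(t - \sigma_2)\, d\sigma_1\, d\sigma_2 \\ &+ \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} h_{2,v}(\sigma_1, \sigma_2)\, v(t - \sigma_1)\, v(t - \sigma_2)\, d\sigma_1\, d\sigma_2 \\ &+ \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} h_{2,u,v}(\sigma_1, \sigma_2)\, u(t - \sigma_1)\, v(t - \sigma_2)\, d\sigma_1\, d\sigma_2 \end{aligned} \tag{2.25}$$

where $h_{2,u,v}$ is the second-order cross-kernel.


2.2 The First-Order Volterra Operator


The output of the first-order Volterra operator for a time-invariant dynamical
system can be written in either of the following forms:

$$y_1(t) = \int_{-\infty}^{\infty} h_1(\sigma)\, u(t - \sigma)\, d\sigma \tag{2.26}$$

$$y_1(t) = \int_{-\infty}^{\infty} h_1(t - \sigma)\, u(\sigma)\, d\sigma \tag{2.27}$$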
The equivalence of these two forms was shown in Section 2.1 for the general nth-order
operator.  Recall  that  if  the  system  is  causal  with  one-sided  inputs,  the  integration 
limits  in  both  expressions  can  be  reduced  to  0 to  t. 

It is straightforward to show that the first-order Volterra operator is a linear
operator. A linear operator $L$ is defined as an operator

$$y(t) = L[u(t)]$$

for which a linear combination of inputs results in the same linear combination of the
corresponding outputs. In other words, the principle of superposition holds for linear
operators. Mathematically, given an input of the form

$$u(t) = \alpha\, v(t) + \beta\, w(t) \tag{2.28}$$

where $\alpha$ and $\beta$ are scalars, the output $y$ of a linear operator can be written as

$$y(t) = L[\alpha\, v(t) + \beta\, w(t)] = \alpha\, L[v(t)] + \beta\, L[w(t)] \tag{2.29}$$

To demonstrate that the first-order Volterra operator is a linear operator, note that,
given an input of the form in equation (2.28), the output of the first-order operator
in equation (2.27) is

$$\begin{aligned} y_1(t) = H_1[u(t)] &= \int_{-\infty}^{\infty} h_1(t - \sigma) \{\alpha\, v(\sigma) + \beta\, w(\sigma)\}\, d\sigma \\ &= \alpha \int_{-\infty}^{\infty} h_1(t - \sigma)\, v(\sigma)\, d\sigma + \beta \int_{-\infty}^{\infty} h_1(t - \sigma)\, w(\sigma)\, d\sigma \end{aligned}$$

Therefore, we have

$$y_1(t) = H_1[\alpha\, v(t) + \beta\, w(t)] = \alpha\, H_1[v(t)] + \beta\, H_1[w(t)] \tag{2.30}$$

Similarly, given the input in equation (2.28), note that

$$u(t - \sigma) = \alpha\, v(t - \sigma) + \beta\, w(t - \sigma)$$

Then, the output of the first-order Volterra operator in equation (2.26) takes the form

$$\begin{aligned} y_1(t) = H_1[u(t)] &= \int_{-\infty}^{\infty} h_1(\sigma) \{\alpha\, v(t - \sigma) + \beta\, w(t - \sigma)\}\, d\sigma \\ &= \alpha \int_{-\infty}^{\infty} h_1(\sigma)\, v(t - \sigma)\, d\sigma + \beta \int_{-\infty}^{\infty} h_1(\sigma)\, w(t - \sigma)\, d\sigma \end{aligned}$$

and we obtain the same result as in equation (2.30):

$$y_1(t) = H_1[\alpha\, v(t) + \beta\, w(t)] = \alpha\, H_1[v(t)] + \beta\, H_1[w(t)] \tag{2.31}$$

Clearly, then, the first-order Volterra operator is a linear operator. In the next section,
it will be shown that the higher-order Volterra operators are nonlinear operators.

The output of a linear system can be completely represented in terms of the
first-order Volterra operator, and the system is completely characterized by the
corresponding first-order kernel. If the system is also time-invariant, it is classified as a
linear time-invariant (LTI) system. It is a well-known classical result that the output
of an LTI system can be obtained through the convolution of the input function with
the impulse response of the system, where the impulse response is simply the response
of the system to a unit impulse applied at $t = 0$. Then, the output of an LTI system
is given by the convolution integral

$$y(t) = \int_{-\infty}^{\infty} h_1(\sigma)\, u(t - \sigma)\, d\sigma = \int_{-\infty}^{\infty} h_1(t - \sigma)\, u(\sigma)\, d\sigma \tag{2.32}$$

where $h_1$ is the impulse response of the system. Clearly, the convolution integral in
equation (2.32) is of the same form as the expressions for the output of the first-order
Volterra operator in equations (2.26) and (2.27). Therefore, the first-order Volterra
kernel $h_1$ of an LTI system is equivalent to the impulse response of the system.

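The superposition property and the interpretation of the first-order kernel as an impulse response can be illustrated in discrete time. The following is a minimal sketch, assuming a causal kernel given by a few made-up samples, one-sided inputs, and a unit sample standing in for the unit impulse (with a time step of one); the function name `convolve` is illustrative.

```python
def convolve(h, u, dt=1.0):
    """Causal discrete convolution: y[k] = dt * sum_i h[i] * u[k - i]."""
    return [dt * sum(h[i] * u[k - i] for i in range(min(k + 1, len(h))))
            for k in range(len(u))]

h1 = [1.0, 0.5, 0.25, 0.125]              # hypothetical first-order kernel samples
v = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5]
w = [0.0, 1.0, 1.0, 4.0, -2.0, 2.0]
alpha, beta = 2.0, -3.0

# Response to the combined input alpha * v + beta * w.
u = [alpha * a + beta * b for a, b in zip(v, w)]
lhs = convolve(h1, u)

# Same linear combination of the individual responses.
rhs = [alpha * a + beta * b for a, b in zip(convolve(h1, v), convolve(h1, w))]

# Response to a unit sample at k = 0 returns the kernel samples themselves.
impulse = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
response = convolve(h1, impulse)
```

Here `lhs` and `rhs` agree up to rounding, and `response` reproduces the entries of `h1` followed by zeros.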
Often, it is advantageous to analyze a system in the frequency domain. Towards
this  end,  the  Fourier  transform  of  the  first-order  Volterra  kernel  is  defined  as 

$$H_1(j\omega) = \int_{-\infty}^{\infty} h_1(t)\, e^{-j\omega t}\, dt \tag{2.33}$$

where, for an LTI system, $H_1(j\omega)$ is termed the frequency response function. The
impulse  response  function  can  be  obtained  from  the  frequency  response  function  using 
the  inverse  Fourier  transform: 

$$h_1(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} H_1(j\omega)\, e^{j\omega t}\, d\omega \tag{2.34}$$

The  convolution  operation  in  the  time  domain  is  equivalent  to  multiplication  in  the 
frequency  domain.  To  demonstrate  this  property,  note  that  the  Fourier  transform  of 
the  output  is 


$$Y(j\omega) = \int_{-\infty}^{\infty} y(t)\, e^{-j\omega t}\, dt \tag{2.35}$$

Substituting for the output in terms of the convolution integral, we have

$$Y(j\omega) = \int_{-\infty}^{\infty} \left\{ \int_{-\infty}^{\infty} h_1(\sigma)\, u(t-\sigma)\, d\sigma \right\} e^{-j\omega t}\, dt$$

Changing variables such that $\tau = t - \sigma$ and rearranging the terms,

$$Y(j\omega) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h_1(\sigma)\, u(\tau)\, e^{-j\omega(\tau+\sigma)}\, d\sigma\, d\tau = \int_{-\infty}^{\infty} h_1(\sigma)\, e^{-j\omega\sigma}\, d\sigma \int_{-\infty}^{\infty} u(\tau)\, e^{-j\omega\tau}\, d\tau$$

Finally, comparing the terms in the above expression to equation (2.33), we obtain

$$Y(j\omega) = H_1(j\omega)\, U(j\omega) \tag{2.36}$$

Therefore, from equation (2.36), it is clear that convolution in the time domain is
equivalent to multiplication in the frequency domain.
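
As a quick numerical check of equation (2.36), the following sketch (Python with NumPy, not part of the original analysis) compares the discrete Fourier transform of a sampled convolution with the product of the individual transforms. The random sequences stand in for sampled versions of $h_1$ and $u$; the function name and the sequence length are illustrative assumptions.

```python
import numpy as np

def convolution_theorem_error(n=64, seed=0):
    """Largest difference between FFT(h * u) and FFT(h) FFT(u) for random sequences."""
    rng = np.random.default_rng(seed)
    h = rng.standard_normal(n)      # stand-in for a sampled kernel h1
    u = rng.standard_normal(n)      # stand-in for a sampled input u
    y = np.convolve(h, u)           # linear convolution, length 2n - 1
    m = len(y)
    # Zero-padding both sequences to length m makes the circular convolution
    # implied by the DFT coincide with the linear convolution above.
    Y = np.fft.fft(y)
    HU = np.fft.fft(h, m) * np.fft.fft(u, m)
    return np.max(np.abs(Y - HU))

error = convolution_theorem_error()
```

The returned error should be at the level of floating-point round-off, which is the discrete counterpart of $Y(j\omega) = H_1(j\omega)\,U(j\omega)$.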

The  frequency  domain  representation  of  a system  is  especially  convenient  when 
the input is periodic. A function $u$ is periodic if there exists a nonzero real number $T$ such
that $u(t+T) = u(t)$ for all values of $t$. As noted by Schetzen [16], a periodic input
can be expanded in terms of a Fourier series as follows:

$$u(t) = \sum_{n=-\infty}^{\infty} c_n\, e^{jn\omega_0 t} \tag{2.37}$$

where $\omega_0$ is the fundamental frequency of the input and $\{c_n\}$ are complex Fourier
expansion coefficients. For an LTI system, the output takes the form

$$y(t) = \sum_{n=-\infty}^{\infty} c_n \int_{-\infty}^{\infty} h_1(\sigma)\, e^{jn\omega_0 (t-\sigma)}\, d\sigma \tag{2.38}$$

where the principle of superposition has been applied. Note that, from equation
(2.33),

$$H_1(jn\omega_0) = \int_{-\infty}^{\infty} h_1(\sigma)\, e^{-jn\omega_0 \sigma}\, d\sigma \tag{2.39}$$

Then, equation (2.38) can be written as

$$y(t) = \sum_{n=-\infty}^{\infty} c_n\, H_1(jn\omega_0)\, e^{jn\omega_0 t} = \sum_{n=-\infty}^{\infty} d_n\, e^{jn\omega_0 t} \tag{2.40}$$

Equation (2.40) shows that, for a periodic input, the output of an LTI system is
also periodic and can be represented in terms of a Fourier series where the Fourier
expansion coefficients $\{d_n\}$ are given by

$$d_n = H_1(jn\omega_0)\, c_n \tag{2.41}$$

Equation (2.40) also demonstrates that, for an LTI system, the output contains only
frequency components that are present in the input. This is not true, in general,
for nonlinear systems. Equation (2.41) shows that the Fourier expansion coefficients
of the output are the input coefficients scaled by the frequency response function at
each frequency.
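
Equation (2.41) can be illustrated with a small sketch. The system used here, $h_1(t) = e^{-t}$ for $t \ge 0$ with $H_1(j\omega) = 1/(1 + j\omega)$, and the input $u(t) = \cos(\omega_0 t)$ are assumptions chosen only because both sides can be evaluated easily; they do not come from the dissertation. The output synthesized from the scaled Fourier coefficients is compared with a direct numerical evaluation of the convolution integral.

```python
import numpy as np

def frequency_response(omega):
    # Assumed example system: h1(t) = exp(-t), t >= 0, so H1(jw) = 1 / (1 + jw).
    return 1.0 / (1.0 + 1j * omega)

def output_coefficients(c, n, omega0):
    # Equation (2.41): d_n = H1(j n omega0) c_n
    return frequency_response(n * omega0) * c

def fourier_sum(coeffs, n, omega0, t):
    # Evaluate sum_n coeffs[n] exp(j n omega0 t). The result is real here
    # because the coefficients occur in complex-conjugate pairs.
    return np.real(np.exp(1j * omega0 * np.outer(t, n)) @ coeffs)

def output_by_convolution(t, omega0, sigma_max=40.0, num=400001):
    # Trapezoidal evaluation of int_0^inf exp(-s) cos(omega0 (t - s)) ds,
    # truncated at sigma_max where exp(-s) is negligible.
    sigma = np.linspace(0.0, sigma_max, num)
    d_sigma = sigma[1] - sigma[0]
    f = np.exp(-sigma) * np.cos(omega0 * (t[:, None] - sigma[None, :]))
    return d_sigma * (f.sum(axis=1) - 0.5 * (f[:, 0] + f[:, -1]))

omega0 = 2.0
n = np.array([-1, 1])
c = np.array([0.5, 0.5])            # u(t) = cos(omega0 t)
t = np.array([0.0, 0.3, 1.1])
y_series = fourier_sum(output_coefficients(c, n, omega0), n, omega0, t)
y_direct = output_by_convolution(t, omega0)
```

Both evaluations should agree to within the quadrature error, and the output contains only the frequency $\omega_0$ that is present in the input.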

It is important to note that the analysis of an LTI system in the frequency domain
is contingent upon the existence of a finite frequency response function $H_1(j\omega)$. It
can be shown that a sufficient condition for the existence of the frequency response
function is that the system be bounded-input, bounded-output (BIBO) stable [16].
An LTI system is BIBO stable provided that, given an input that is bounded such
that $|u(t)| < M$ for all $t$ (where $M$ is a finite, real number), the output is also
bounded for all $t$. Note that, for a bounded input,

$$|y(t)| = \left| \int_{-\infty}^{\infty} h_1(\sigma)\, u(t-\sigma)\, d\sigma \right| \le \int_{-\infty}^{\infty} |h_1(\sigma)|\, |u(t-\sigma)|\, d\sigma \le M \int_{-\infty}^{\infty} |h_1(\sigma)|\, d\sigma \tag{2.42}$$

In order for the output to be bounded, we must have $|y(t)| < \infty$. Clearly, from
equation (2.42), $y$ will be bounded if the following condition is satisfied:

$$\int_{-\infty}^{\infty} |h_1(\sigma)|\, d\sigma < \infty \tag{2.43}$$


Equation  (2.43)  is  a sufficient  condition  for  the  boundedness  of  y(t).  To  show  that 
equation  (2.43)  is  also  a necessary  condition,  Schetzen  [16]  noted  that  a “worst  case” 
input  can  be  defined  such  that 


$$u(t-\sigma) = \begin{cases} +1, & h_1(\sigma) \ge 0 \\ -1, & h_1(\sigma) < 0 \end{cases} \tag{2.44}$$

The input defined in equation (2.44) is bounded since $|u(t-\sigma)| = 1$ for all $t$. Then,
for this input,

$$|y(t)| = \left| \int_{-\infty}^{\infty} h_1(\sigma)\, u(t-\sigma)\, d\sigma \right| = \int_{-\infty}^{\infty} |h_1(\sigma)|\, d\sigma \tag{2.45}$$


In  order  for  y to  be  bounded,  the  inequality  in  equation  (2.43)  must  be  satisfied. 
Therefore,  equation  (2.43)  is  both  a necessary  and  sufficient  condition  for  a system 
to  be  BIBO  stable.  The  existence  of  a finite  frequency  response  function  for  an  LTI 
system  can  be  examined  by  first  noting  that,  from  Euler’s  identity, 


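$$\left| e^{-j\omega t} \right| = \left| \cos(\omega t) - j \sin(\omega t) \right| = 1 \tag{2.46}$$

Then, using equation (2.33), a finite frequency response function $H_1(j\omega)$ exists if

$$|H_1(j\omega)| = \left| \int_{-\infty}^{\infty} h_1(t)\, e^{-j\omega t}\, dt \right| \le \int_{-\infty}^{\infty} |h_1(t)|\, \left| e^{-j\omega t} \right|\, dt < \infty \tag{2.47}$$

Therefore, using the result from equation (2.46), a sufficient condition for the existence
of a finite frequency response function is

$$\int_{-\infty}^{\infty} |h_1(t)|\, dt < \infty \tag{2.48}$$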

Equation  (2.48)  is  exactly  the  condition  that  must  be  satisfied  for  an  LTI  system 
to  be  BIBO  stable.  Therefore,  an  LTI  system  possesses  a finite  frequency  response 
function  and,  equivalently,  a finite-valued  first-order  Volterra  kernel,  if  it  is  BIBO 
stable  [16]. 
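
The bound in equation (2.42) and the worst-case input of equation (2.44) can be checked on a sampled kernel. The sketch below is only illustrative: the damped sinusoid used for $h_1$, the step size, and the function names are assumptions, and the integrals are replaced by simple Riemann sums at a single fixed time $t$.

```python
import numpy as np

def l1_norm(h, dt):
    # Riemann-sum approximation of the integral of |h1(sigma)|, equation (2.43)
    return np.sum(np.abs(h)) * dt

def output_at_fixed_time(h, u_shifted, dt):
    # y(t) = int h1(sigma) u(t - sigma) d sigma, with u_shifted[k] = u(t - sigma_k)
    return np.sum(h * u_shifted) * dt

dt = 1e-3
sigma = np.arange(0.0, 20.0, dt)
h = np.exp(-0.5 * sigma) * np.sin(3.0 * sigma)    # assumed example kernel
worst_case = np.where(h >= 0.0, 1.0, -1.0)        # equation (2.44), M = 1
random_input = np.random.default_rng(0).uniform(-1.0, 1.0, sigma.size)

bound = l1_norm(h, dt)
y_worst = output_at_fixed_time(h, worst_case, dt)
y_random = output_at_fixed_time(h, random_input, dt)
```

The worst-case input attains the bound, as in equation (2.45), while any other input bounded by $M = 1$ stays at or below it.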

2.3  Higher-Order  Volterra  Operators 

The  representation  of  nonlinear  dynamical  systems  using  Volterra  series  requires 
the  inclusion  of  Volterra  operators  of  second  order  and  higher.  This  dissertation 
focuses  on  those  nonlinear  systems  that  can  be  represented  in  terms  of  the  first  and 
second-order  Volterra  operators  alone.  These  systems  fall  under  the  classification  of 
“weakly nonlinear” systems. This section focuses on the properties of the second-order
Volterra operator, noting that the discussion can easily be extended to higher-order
operators.  This  discussion  is  also  limited  to  time-invariant  systems,  although  it  is  not 
difficult  to  generalize  the  results  to  nonstationary  systems. 

Recall  that  the  output  of  the  nth-order  Volterra  operator  for  a time-invariant 
dynamical  system  takes  either  of  the  following  equivalent  forms: 


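$$y_n(t) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} h_n(\sigma_1, \ldots, \sigma_n)\, u(t-\sigma_1) \cdots u(t-\sigma_n)\, d\sigma_1 \cdots d\sigma_n \tag{2.49}$$

$$y_n(t) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} h_n(t-\sigma_1, \ldots, t-\sigma_n)\, u(\sigma_1) \cdots u(\sigma_n)\, d\sigma_1 \cdots d\sigma_n \tag{2.50}$$

The second-order Volterra operator, then, can be written in either of the forms

$$y_2(t) = H_2[u(t)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h_2(\sigma_1, \sigma_2)\, u(t-\sigma_1)\, u(t-\sigma_2)\, d\sigma_1\, d\sigma_2 \tag{2.51}$$

$$y_2(t) = H_2[u(t)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h_2(t-\sigma_1, t-\sigma_2)\, u(\sigma_1)\, u(\sigma_2)\, d\sigma_1\, d\sigma_2 \tag{2.52}$$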

where  h2  is  the  second-order  Volterra  kernel  of  the  system.  It  is  straightforward  to 
show  that  the  second-order  Volterra  operator  is  a nonlinear  operator.  A nonlinear 
operator  is,  by  definition,  any  operator  that  is  not  linear.  Assuming  an  input  of  the 
form 


$$u(t) = \alpha\, v(t) + \beta\, w(t) \tag{2.53}$$

where $\alpha$ and $\beta$ are scalars, the output of the second-order operator in equation (2.52)
is given by

$$H_2[u(t)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h_2(t-\sigma_1, t-\sigma_2)\, \{\alpha\, v(\sigma_1) + \beta\, w(\sigma_1)\}\, \{\alpha\, v(\sigma_2) + \beta\, w(\sigma_2)\}\, d\sigma_1\, d\sigma_2 \tag{2.54}$$

Recall that, in order for $H_2$ to be a linear operator, the following condition must be
satisfied:

$$H_2[\alpha\, v(t) + \beta\, w(t)] = \alpha\, H_2[v(t)] + \beta\, H_2[w(t)] \tag{2.55}$$

Note that

$$\alpha\, H_2[v(t)] + \beta\, H_2[w(t)] = \alpha \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h_2(t-\sigma_1, t-\sigma_2)\, v(\sigma_1)\, v(\sigma_2)\, d\sigma_1\, d\sigma_2 + \beta \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h_2(t-\sigma_1, t-\sigma_2)\, w(\sigma_1)\, w(\sigma_2)\, d\sigma_1\, d\sigma_2 \tag{2.56}$$

It is clear that the expressions in equations (2.54) and (2.56) are not equivalent and
equation (2.55) is not satisfied. Therefore, the second-order Volterra operator is a
nonlinear operator. The same result could have been obtained using the expression
for the second-order operator in equation (2.51). Similarly, the Volterra operators of
third order and higher can also be shown to be nonlinear.

Nonlinear dynamical systems can be represented in terms of Volterra series
expansions that include Volterra operators of second order and higher. A given system
is characterized by the Volterra kernels that appear in this representation. Just as
the first-order kernel can be interpreted as an impulse response function, higher-order
kernels can be viewed in terms of the system response to multiple impulse inputs.
For example, for a degree two homogeneous system, different “components” of the
second-order kernel can be related to the system response to two impulses applied at
different times. Indeed, in theory, higher-order Volterra kernels can be measured by
applying multiple impulses to the system. This approach is rarely practical, however,
mainly due to the difficulty in applying impulse inputs to physical systems.
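
Returning to the nonlinearity argument of equations (2.53) through (2.56), a discrete-time analogue makes the failure of superposition easy to see. The sketch below is a minimal illustration under stated assumptions: the double integral is replaced by a double sum over a finite kernel matrix, the kernel and inputs are random, and all names are hypothetical.

```python
import numpy as np

def second_order_output(h2, u):
    # y2[n] = sum_i sum_j h2[i, j] u[n - i] u[n - j], with u[k] = 0 for k < 0
    m = h2.shape[0]
    padded = np.concatenate([np.zeros(m - 1), u])
    y = np.empty(len(u))
    for n in range(len(u)):
        window = padded[n:n + m][::-1]    # u[n], u[n-1], ..., u[n-m+1]
        y[n] = window @ h2 @ window
    return y

rng = np.random.default_rng(0)
h2 = rng.standard_normal((4, 4))
v = rng.standard_normal(32)
w = rng.standard_normal(32)
alpha, beta = 2.0, -3.0

lhs = second_order_output(h2, alpha * v + beta * w)   # H2[alpha v + beta w]
rhs = alpha * second_order_output(h2, v) + beta * second_order_output(h2, w)
```

Here `lhs` and `rhs` differ, so the discrete counterpart of equation (2.55) is not satisfied. Scaling the input by $\alpha$ scales the output by $\alpha^2$, which is the homogeneity of degree two expected of a second-order operator.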

It  should  be  noted  that  the  Volterra  kernels,  except  for  the  first-order  kernel,  are 
not  unique  for  a given  system.  It  is  often  convenient,  for  computational  purposes,  to 
impose  uniqueness  on  the  Volterra  kernels.  It  can  be  shown  that  unique  forms  for  the 
kernels  exist  in  terms  of  special  restricted  forms  such  as  the  symmetric,  triangular, 
and  regular  forms  [15].  Here,  only  the  symmetric  and  triangular  forms  are  discussed. 
A second-order  Volterra  kernel  h2  is  symmetric  if  it  satisfies 

$$h_2(\sigma_1, \sigma_2) = h_2(\sigma_2, \sigma_1) \tag{2.57}$$

It can be shown that any second-order kernel can be converted into the symmetric
form shown in equation (2.57). As an example, consider a separable second-order
kernel of the form

$$h_2(\sigma_1, \sigma_2) = f(\sigma_1)\, g(\sigma_2) \tag{2.58}$$

This kernel is not necessarily symmetric since, in general,

$$f(\sigma_1)\, g(\sigma_2) \ne f(\sigma_2)\, g(\sigma_1)$$

The kernel in equation (2.58) can be viewed as the kernel of the second-order system
that is formed through the multiplication of two first-order systems that are connected
in parallel, having first-order Volterra kernels $f$ and $g$, respectively [15]. The output
of this system is obtained through the multiplication of the outputs of the two
first-order systems:


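$$\begin{aligned}
y(t) &= \int_{-\infty}^{\infty} f(\sigma_1)\, u(t-\sigma_1)\, d\sigma_1 \int_{-\infty}^{\infty} g(\sigma_2)\, u(t-\sigma_2)\, d\sigma_2 \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(\sigma_1)\, g(\sigma_2)\, u(t-\sigma_1)\, u(t-\sigma_2)\, d\sigma_1\, d\sigma_2 \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h_2(\sigma_1, \sigma_2)\, u(t-\sigma_1)\, u(t-\sigma_2)\, d\sigma_1\, d\sigma_2
\end{aligned} \tag{2.59}$$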

where  h2  is  the  second-order  kernel  defined  in  equation  (2.58).  Clearly,  the  order  of 
multiplication  is  unimportant  in  equation  (2.59)  and  the  output  can  also  be  written 
as 


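$$\begin{aligned}
y(t) &= \int_{-\infty}^{\infty} g(\sigma_1)\, u(t-\sigma_1)\, d\sigma_1 \int_{-\infty}^{\infty} f(\sigma_2)\, u(t-\sigma_2)\, d\sigma_2 \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(\sigma_1)\, f(\sigma_2)\, u(t-\sigma_1)\, u(t-\sigma_2)\, d\sigma_1\, d\sigma_2 \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h_2(\sigma_2, \sigma_1)\, u(t-\sigma_1)\, u(t-\sigma_2)\, d\sigma_1\, d\sigma_2
\end{aligned} \tag{2.60}$$

where, as stated before, $h_2(\sigma_2, \sigma_1) = g(\sigma_1)\, f(\sigma_2)$ is not, in general, equal to $h_2(\sigma_1, \sigma_2)$.
Hence, the second-order system has two different kernel representations, $h_2(\sigma_1, \sigma_2)$
and $h_2(\sigma_2, \sigma_1)$. Clearly, if equations (2.59) and (2.60) are added together and the sum
is divided by two, we obtain

$$y(t) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h_{2,\mathrm{sym}}(\sigma_1, \sigma_2)\, u(t-\sigma_1)\, u(t-\sigma_2)\, d\sigma_1\, d\sigma_2 \tag{2.61}$$

where the symmetric kernel $h_{2,\mathrm{sym}}(\sigma_1, \sigma_2)$ is defined as

$$h_{2,\mathrm{sym}}(\sigma_1, \sigma_2) = \frac{1}{2} \left\{ h_2(\sigma_1, \sigma_2) + h_2(\sigma_2, \sigma_1) \right\} \tag{2.62}$$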

Using  equation  (2.62),  a unique  symmetric  kernel  can  be  derived  for  any  second-order 
system.  Higher-order  symmetric  kernels  can  be  obtained  using  the  following  general 
expression: 


$$h_{n,\mathrm{sym}}(\sigma_1, \ldots, \sigma_n) = \frac{1}{n!} \sum_{\pi(\cdot)} h_n(\sigma_{\pi(1)}, \ldots, \sigma_{\pi(n)}) \tag{2.63}$$

where $\pi(\cdot)$ represents any permutation of the integers 1 through $n$, and the sum is taken
over all such permutations. In general, there are $n!$ possible permutations of the
integers $1, \ldots, n$ and the resulting symmetric kernel satisfies

$$h_{n,\mathrm{sym}}(\sigma_1, \ldots, \sigma_n) = h_{n,\mathrm{sym}}(\sigma_{\pi(1)}, \ldots, \sigma_{\pi(n)}) \tag{2.64}$$

As in the second-order case, the symmetric forms of the higher-order kernels are
unique for a given system. It is convenient to work with symmetric kernels because
the order of the variables $\sigma_1, \ldots, \sigma_n$ is unimportant. Since all Volterra kernels can
be put into a symmetric form using equation (2.63), one can always assume, without
loss of generality, that the kernels are symmetric.
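
The symmetrization in equation (2.63) has a direct discrete analogue in which the kernel is stored as an $n$-dimensional array and the permutations act on its axes. The sketch below is an illustration under that assumption; the function names and the random test kernel are hypothetical.

```python
import itertools
import math
import numpy as np

def symmetrize(h):
    # Discrete analogue of equation (2.63): average over all n! axis permutations.
    n = h.ndim
    total = sum(np.transpose(h, p) for p in itertools.permutations(range(n)))
    return total / math.factorial(n)

def homogeneous_output(h, x):
    # Sum over all indices of h[i1, ..., in] x[i1] ... x[in],
    # computed by contracting one axis at a time.
    out = h
    for _ in range(h.ndim):
        out = np.tensordot(out, x, axes=([0], [0]))
    return float(out)

rng = np.random.default_rng(0)
h3 = rng.standard_normal((3, 3, 3))     # arbitrary third-order kernel samples
x = rng.standard_normal(3)
h3_sym = symmetrize(h3)
```

The symmetrized array is invariant under any exchange of its axes, as in equation (2.64), and it produces the same output as the original array, which is why symmetry can be assumed without loss of generality.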

The triangular form of the nth-order Volterra kernel $h_{n,\mathrm{tri}}$ satisfies the property

$$h_{n,\mathrm{tri}}(\sigma_1, \sigma_2, \ldots, \sigma_n) = 0 \quad \text{for } \sigma_{i+j} > \sigma_j \tag{2.65}$$

where $i$ and $j$ are positive integers. The resulting kernel is supported over a triangular
domain. This domain is easy to visualize for the second-order kernel, where the kernel
is supported in the triangular region bounded by $\sigma_1 = t$, $\sigma_2 = 0$, and $\sigma_1 = \sigma_2$. The
condition in equation (2.65) can be explicitly incorporated into the expression for the
nth-order kernel as follows:

$$h_{n,\mathrm{tri}}(\sigma_1, \sigma_2, \ldots, \sigma_n) = h_{n,\mathrm{tri}}(\sigma_1, \sigma_2, \ldots, \sigma_n)\, \delta_{-1}(\sigma_1 - \sigma_2) \cdots \delta_{-1}(\sigma_{n-1} - \sigma_n) \tag{2.66}$$

where $\delta_{-1}$ represents the unit step function, defined as

$$\delta_{-1}(\sigma_j - \sigma_{j+1}) = \begin{cases} 1, & \sigma_j \ge \sigma_{j+1} \\ 0, & \sigma_j < \sigma_{j+1} \end{cases} \tag{2.67}$$


It should be noted that equation (2.66) is not the only valid form for the triangular
kernel. Indeed, any permutation of the arguments $\sigma_1, \ldots, \sigma_n$ in equation (2.66) also
yields a triangular kernel. Therefore, there are a total of $n!$ possible forms for an
nth-order triangular kernel. As will be seen in the next section, triangular forms of
the kernels naturally appear in Volterra series representations of ordinary differential
equations.

Because  every  Volterra  kernel  can  be  expressed  in  a unique,  symmetric  form,  it 
is  not  surprising  that  there  is  a relationship  between  the  triangular  and  symmetric 
forms  of  a kernel.  This  relationship  is  given  by 


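$$h_{n,\mathrm{tri}}(\sigma_1, \sigma_2, \ldots, \sigma_n) = n!\, h_{n,\mathrm{sym}}(\sigma_1, \sigma_2, \ldots, \sigma_n)\, \delta_{-1}(\sigma_1 - \sigma_2) \cdots \delta_{-1}(\sigma_{n-1} - \sigma_n) \tag{2.68}$$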

The  choice  of  which  form  of  the  kernel  to  use  is  a matter  of  convenience.  In  this 
dissertation,  both  the  triangular  and  symmetric  forms  of  the  second-order  kernel  are 
considered.  Multiwavelets  that  are  easily  adapted  for  the  representation  of  symmetric 
kernels  are  derived  and  implemented.  Alternatively,  wavelets  that  are  supported 
over  triangular  domains  are  derived  and  used  to  obtain  low-order  approximations  of 
triangular  second-order  kernels. 
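
For $n = 2$, the relationship in equation (2.68) can be mimicked on a sampled kernel. The sketch below is a discrete analogue under stated assumptions: the kernel is a square matrix whose row index plays the role of $\sigma_1$, and the diagonal entries are kept once rather than doubled, a detail that arises only after sampling because the line $\sigma_1 = \sigma_2$ has zero area in the continuous double integral. The names are hypothetical.

```python
import numpy as np

def triangular_from_symmetric(h_sym):
    # Keep sigma_1 >= sigma_2 (row index >= column index). Entries strictly below
    # the diagonal are multiplied by n! = 2; the diagonal is kept once.
    return 2.0 * np.tril(h_sym, k=-1) + np.diag(np.diag(h_sym))

rng = np.random.default_rng(1)
g = rng.standard_normal((5, 5))
h_sym = 0.5 * (g + g.T)                  # symmetric kernel samples, equation (2.62)
h_tri = triangular_from_symmetric(h_sym)
x = rng.standard_normal(5)               # samples of u(t - sigma_k) at one fixed t

y_sym = x @ h_sym @ x
y_tri = x @ h_tri @ x
```

The triangular array vanishes above the diagonal and reproduces the same second-order output as the symmetric array, while storing roughly half as many nonzero coefficients.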

Just  as  in  the  first-order  case,  higher-order  Volterra  kernels  can  also  be  analyzed 
in  the  frequency  domain.  The  n-dimensional  Fourier  transform  is  defined  as 


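$$H_n(j\omega_1, \ldots, j\omega_n) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} h_n(\sigma_1, \ldots, \sigma_n)\, e^{-j(\omega_1 \sigma_1 + \cdots + \omega_n \sigma_n)}\, d\sigma_1 \cdots d\sigma_n \tag{2.69}$$

and the inverse Fourier transform takes the form

$$h_n(\sigma_1, \ldots, \sigma_n) = \frac{1}{(2\pi)^n} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} H_n(j\omega_1, \ldots, j\omega_n)\, e^{j(\omega_1 \sigma_1 + \cdots + \omega_n \sigma_n)}\, d\omega_1 \cdots d\omega_n \tag{2.70}$$

As an example, the second-order Volterra kernel can be expressed in the frequency
domain as

$$H_2(j\omega_1, j\omega_2) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h_2(\sigma_1, \sigma_2)\, e^{-j(\omega_1 \sigma_1 + \omega_2 \sigma_2)}\, d\sigma_1\, d\sigma_2 \tag{2.71}$$

where $H_2(j\omega_1, j\omega_2)$ is known as the kernel transform of the second-order kernel.
Clearly, if $H_2(j\omega_1, j\omega_2)$ is known, the second-order kernel $h_2$ can be recovered using
the inverse Fourier transform:

$$h_2(\sigma_1, \sigma_2) = \frac{1}{(2\pi)^2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} H_2(j\omega_1, j\omega_2)\, e^{j(\omega_1 \sigma_1 + \omega_2 \sigma_2)}\, d\omega_1\, d\omega_2 \tag{2.72}$$

It should be noted that, just as in the first-order case, the existence of finite Fourier
transforms of the higher-order kernels is dependent upon the condition

$$\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} |h_n(\sigma_1, \ldots, \sigma_n)|\, d\sigma_1 \cdots d\sigma_n < \infty \tag{2.73}$$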

In  Section  2.2,  it  was  shown  that,  in  the  first-order  case,  this  condition  is  both  a 
necessary  and  sufficient  condition  for  a linear  system  to  be  bounded-input,  bounded- 
output  (BIBO)  stable.  For  higher-order,  nonlinear  systems,  equation  (2.73)  is  a 
sufficient  condition  for  BIBO  stability,  but  it  has  been  shown  by  Schetzen  [16]  that 
it  is  not  a necessary  condition. 

2.4  Volterra  Series  Representations  of  Ordinary  Differential  Equations 
There  is  a distinct  relationship  between  analytic  ordinary  differential  equations 
(ODEs)  and  Volterra  series.  It  can  be  shown  that  any  ODE  can  be  converted  into  an 
integral equation through the process of integration. The solution of the corresponding
integral equation that is obtained using the method of successive substitutions,
also  known  as  Picard  iteration,  takes  the  form  of  a Volterra  series.  In  this  section,  the 
relationship  between  ODEs  and  Volterra  series  will  be  discussed  for  several  different 
time-invariant ODEs. Since an ODE of any given order can be expressed in terms
of a first-order state space equation, the discussion in this section is applicable to all
analytic  ODEs,  regardless  of  their  order. 

Consider  a linear,  time-invariant  ODE  of  the  form 


$$\begin{aligned}
\dot{x}(t) &= [A]\, x(t) + b\, u(t), \quad t \ge 0 \\
y(t) &= c^T x(t), \quad x(0) = x_0
\end{aligned} \tag{2.74}$$


where  the  input  u is  bounded  and  piecewise-continuous  over  the  domain  of  interest. 
It  is  well  known  that  the  solution  to  equation  (2.74)  is  given  by 


$$y(t) = c^T e^{[A]t} x_0 + c^T \int_0^t e^{[A](t-\sigma)}\, b\, u(\sigma)\, d\sigma \tag{2.75}$$

where the matrix exponential $e^{[A](t-\sigma)}$ represents the state transition matrix $[\Phi(t, \sigma)]$
for time-invariant systems. The matrix exponential is defined as

$$e^{[A]t} := \sum_{n=0}^{\infty} \frac{[A]^n t^n}{n!} = [I] + [A]\, t + \frac{1}{2!} [A]^2 t^2 + \cdots \tag{2.76}$$

where $[I]$ is the identity matrix. It can be shown that, in general, the state transition
matrix for both time-invariant and time-varying systems satisfies the following
properties:

$$\begin{aligned}
&(1) \quad [\Phi(t, t)] = [I] \\
&(2) \quad [\Phi(t, \sigma)]\, [\Phi(\sigma, \tau)] = [\Phi(t, \tau)] \\
&(3) \quad [\Phi(t, \sigma)]^{-1} = [\Phi(\sigma, t)]
\end{aligned} \tag{2.77}$$

Clearly, equation (2.75) shows that the closed form solution of equation (2.74) takes
the form of a first-order Volterra series,

$$y(t) = y_0(t) + \int_0^t h_1(t-\sigma)\, u(\sigma)\, d\sigma \tag{2.78}$$

where the zero-order term $y_0$ is given as

$$y_0(t) = c^T e^{[A]t} x_0 \tag{2.79}$$

and the first-order Volterra kernel takes the form

$$h_1(t-\sigma) = c^T e^{[A](t-\sigma)}\, b \tag{2.80}$$
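
Equations (2.76) and (2.80) can be combined into a short numerical sketch. The example system below, a damped linear oscillator written in state-space form, and all numerical values are assumptions made for illustration. The matrix exponential is evaluated from the truncated series of equation (2.76) with a scaling-and-squaring step added for accuracy, which is a standard numerical device and not something taken from the text.

```python
import numpy as np

def expm_series(M, terms=30, squarings=8):
    # Truncated power series of equation (2.76), applied to M / 2**squarings,
    # followed by repeated squaring to recover exp(M).
    scaled = M / (2 ** squarings)
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for n in range(1, terms):
        term = term @ scaled / n          # scaled**n / n!
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result

def first_order_kernel(A, b, c, t):
    # Equation (2.80) with sigma = 0: h1(t) = c^T exp([A] t) b
    return float(c @ expm_series(A * t) @ b)

# Assumed example: x'' + 2 zeta omega x' + omega^2 x = u, output y = x
omega, zeta = 2.0, 0.1
A = np.array([[0.0, 1.0], [-omega**2, -2.0 * zeta * omega]])
b = np.array([0.0, 1.0])
c = np.array([1.0, 0.0])

t_eval = 1.3
h_numeric = first_order_kernel(A, b, c, t_eval)
omega_d = omega * np.sqrt(1.0 - zeta**2)
h_exact = np.exp(-zeta * omega * t_eval) * np.sin(omega_d * t_eval) / omega_d
```

For this system the kernel computed from equation (2.80) coincides with the familiar impulse response of an underdamped oscillator, consistent with the equivalence between the first-order kernel and the impulse response.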


As  expected,  since  equation  (2.74)  represents  a linear  system,  the  output  is  completely 
characterized  in  terms  of  the  first-order  Volterra  operator  and  the  zero-order  term. 
Note  that  if  the  initial  condition  x0  is  zero,  the  zero-order  term  vanishes.  Equation 
(2.75)  is  typically  obtained  from  the  variation  of  constants  formula.  However,  an 
alternative  solution  method,  which  generalizes  to  nonlinear  equations,  can  be  used 
to  derive  the  same  result.  This  procedure  entails  converting  equation  (2.74)  into 
a Volterra  integral  equation  and  applying  the  method  of  successive  substitutions. 
Equation (2.74) can be expressed as an integral equation via integration:

$$x(t) = x_0 + \int_0^t \left\{ [A]\, x(\sigma_1) + b\, u(\sigma_1) \right\} d\sigma_1 \tag{2.81}$$

Equation (2.81) is a Volterra integral equation of the second kind which can be
solved for the state vector $x$. Note that the solution to equation (2.81) appears in
the integrand, a characteristic of all integral equations. As demonstrated by Rugh
[15], this equation can be solved using the method of successive substitutions, which
assumes that $x(\sigma_1)$ takes the form

$$x(\sigma_1) = x_0 + \int_0^{\sigma_1} \left\{ [A]\, x(\sigma_2) + b\, u(\sigma_2) \right\} d\sigma_2 \tag{2.82}$$


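Then, substituting into equation (2.81), we have

$$\begin{aligned}
x(t) &= x_0 + \int_0^t \left\{ [A] \left\{ x_0 + \int_0^{\sigma_1} \left\{ [A]\, x(\sigma_2) + b\, u(\sigma_2) \right\} d\sigma_2 \right\} + b\, u(\sigma_1) \right\} d\sigma_1 \\
&= x_0 + [A]\, t\, x_0 + \int_0^t \int_0^{\sigma_1} [A]\, b\, u(\sigma_2)\, d\sigma_2\, d\sigma_1 + \int_0^t b\, u(\sigma_1)\, d\sigma_1 + \int_0^t \int_0^{\sigma_1} [A]^2 x(\sigma_2)\, d\sigma_2\, d\sigma_1
\end{aligned} \tag{2.83}$$

Using equation (2.82), $x(\sigma_2)$ can be expressed as

$$x(\sigma_2) = x_0 + \int_0^{\sigma_2} \left\{ [A]\, x(\sigma_3) + b\, u(\sigma_3) \right\} d\sigma_3$$

and substituted into equation (2.83) to obtain

$$\begin{aligned}
x(t) ={}& x_0 + [A]\, t\, x_0 + \int_0^t \int_0^{\sigma_1} [A]\, b\, u(\sigma_2)\, d\sigma_2\, d\sigma_1 + \int_0^t b\, u(\sigma_1)\, d\sigma_1 \\
&+ \int_0^t \int_0^{\sigma_1} [A]^2 \left\{ x_0 + \int_0^{\sigma_2} \left\{ [A]\, x(\sigma_3) + b\, u(\sigma_3) \right\} d\sigma_3 \right\} d\sigma_2\, d\sigma_1 \\
={}& \left\{ [I] + [A]\, t + \tfrac{1}{2} [A]^2 t^2 \right\} x_0 + \int_0^t b\, u(\sigma_1)\, d\sigma_1 \\
&+ \int_0^t \int_0^{\sigma_1} [A]\, b\, u(\sigma_2)\, d\sigma_2\, d\sigma_1 + \int_0^t \int_0^{\sigma_1} \int_0^{\sigma_2} [A]^2 b\, u(\sigma_3)\, d\sigma_3\, d\sigma_2\, d\sigma_1 \\
&+ \int_0^t \int_0^{\sigma_1} \int_0^{\sigma_2} [A]^3 x(\sigma_3)\, d\sigma_3\, d\sigma_2\, d\sigma_1
\end{aligned} \tag{2.84}$$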


Note that the third and fourth terms in the final form of equation (2.84) can be
simplified by changing the order of integration. For example, the third term can be
rewritten as

$$\int_0^t \int_0^{\sigma_1} [A]\, b\, u(\sigma_2)\, d\sigma_2\, d\sigma_1 = \int_0^t \int_{\sigma_2}^{t} [A]\, b\, u(\sigma_2)\, d\sigma_1\, d\sigma_2 = \int_0^t [A]\, (t - \sigma_2)\, b\, u(\sigma_2)\, d\sigma_2 \tag{2.85}$$

Similarly, the fourth term can be simplified to the form

$$\int_0^t \int_0^{\sigma_1} \int_0^{\sigma_2} [A]^2 b\, u(\sigma_3)\, d\sigma_3\, d\sigma_2\, d\sigma_1 = \int_0^t \tfrac{1}{2} [A]^2 (t - \sigma_3)^2\, b\, u(\sigma_3)\, d\sigma_3 \tag{2.86}$$

Using  these  results,  equation  (2.84)  can  be  written  as 


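$$\begin{aligned}
x(t) ={}& \left\{ [I] + [A]\, t + \tfrac{1}{2} [A]^2 t^2 \right\} x_0 \\
&+ \int_0^t \left\{ [I] + [A]\, (t - \sigma_1) + \tfrac{1}{2} [A]^2 (t - \sigma_1)^2 \right\} b\, u(\sigma_1)\, d\sigma_1 \\
&+ \int_0^t \int_0^{\sigma_1} \int_0^{\sigma_2} [A]^3 x(\sigma_3)\, d\sigma_3\, d\sigma_2\, d\sigma_1
\end{aligned} \tag{2.87}$$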
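
The successive substitutions leading to equation (2.87) can also be carried out numerically on a time grid. The following sketch is illustrative only: the integral in equation (2.81) is approximated with a cumulative trapezoidal rule, and the example system (an undamped oscillator driven by a unit step from rest) is an assumption chosen because its exact response is known in closed form.

```python
import numpy as np

def picard_iterate(A, b, u, x0, t, iterations):
    # Repeatedly apply x_{k+1}(t) = x0 + int_0^t { [A] x_k(s) + b u(s) } ds,
    # starting from the constant guess x_0(t) = x0.
    dt = np.diff(t)
    x = np.tile(x0, (len(t), 1))
    forcing = np.outer(u, b)                     # rows: b u(s)
    for _ in range(iterations):
        integrand = x @ A.T + forcing            # rows: [A] x_k(s) + b u(s)
        steps = 0.5 * (integrand[1:] + integrand[:-1]) * dt[:, None]
        integral = np.vstack([np.zeros(len(x0)), np.cumsum(steps, axis=0)])
        x = x0 + integral
    return x

# Assumed example: x1' = x2, x2' = -x1 + u, with u(t) = 1 and x(0) = 0
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
b = np.array([0.0, 1.0])
t = np.linspace(0.0, 2.0, 2001)
u = np.ones_like(t)
x_picard = picard_iterate(A, b, u, np.zeros(2), t, iterations=30)
x_exact = np.stack([1.0 - np.cos(t), np.sin(t)], axis=1)
max_error = np.max(np.abs(x_picard - x_exact))
```

With each iteration one more term of the series in equation (2.76) is captured, so on a finite interval the iterates approach the exact response and the remaining error is dominated by the quadrature rule.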

It  can  be  demonstrated  that,  as  the  number  of  iterations  in  the  method  of  successive 
substitutions  approaches  infinity,  the  last  term  (the  term  involving  the  state  vector) 
approaches  zero  [15].  Using  this  fact  and  the  definition  of  the  matrix  exponential 
given  in  equation  (2.76),  as  the  number  of  iterations  approaches  infinity,  we  obtain 


$$x(t) = e^{[A]t} x_0 + \int_0^t e^{[A](t-\sigma_1)}\, b\, u(\sigma_1)\, d\sigma_1$$

and, finally,

$$y(t) = c^T x(t) = c^T e^{[A]t} x_0 + c^T \int_0^t e^{[A](t-\sigma_1)}\, b\, u(\sigma_1)\, d\sigma_1 \tag{2.88}$$

This result is identical to the solution given by equation (2.75).

It has been shown that, by transforming an ordinary differential equation into
an integral equation and applying the method of successive substitutions, a Volterra
series representation of the solution can be obtained. In the previous example, this
procedure was applied to a linear system, and the resulting series consisted only of
the zero-order term and the first-order Volterra operator. Volterra series
representations of nonlinear systems include higher-order terms and are generally infinite series.
Consider a bilinear system of the form

$$\begin{aligned}
\dot{x}(t) &= [A]\, x(t) + [D]\, x(t)\, u(t) + b\, u(t), \quad t \ge 0 \\
y(t) &= c^T x(t), \quad x(0) = x_0
\end{aligned} \tag{2.89}$$

where  the  input  u is  bounded  and  piecewise-continuous  over  the  domain  of  interest. 
Rugh  [15]  showed  that  the  evaluation  of  equation  (2.89)  can  be  simplified  via  a change 
of  variables: 


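$$z(t) = e^{-[A]t} x(t) \;\Rightarrow\; x(t) = e^{[A]t} z(t) \tag{2.90}$$

Note that

$$\begin{aligned}
\frac{d}{dt} e^{[A]t} &= \frac{d}{dt} \left\{ [I] + [A]\, t + \tfrac{1}{2} [A]^2 t^2 + \tfrac{1}{3!} [A]^3 t^3 + \cdots \right\} \\
&= [A] + [A]^2 t + \tfrac{1}{2} [A]^3 t^2 + \cdots \\
&= [A] \left\{ [I] + [A]\, t + \tfrac{1}{2} [A]^2 t^2 + \cdots \right\} \\
&= [A]\, e^{[A]t}
\end{aligned}$$

Then, differentiating equation (2.90), we have

$$\dot{x}(t) = [A]\, e^{[A]t} z(t) + e^{[A]t} \dot{z}(t) \tag{2.91}$$

Substituting for $x(t)$ and $\dot{x}(t)$ in equation (2.89), we obtain

$$[A]\, e^{[A]t} z(t) + e^{[A]t} \dot{z}(t) = [A]\, e^{[A]t} z(t) + [D]\, e^{[A]t} z(t)\, u(t) + b\, u(t)$$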


which  can  be  simplified  to  give  the  following  equation  for  the  new  variable  z: 


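$$\dot{z}(t) = e^{-[A]t} [D]\, e^{[A]t} z(t)\, u(t) + e^{-[A]t} b\, u(t) \tag{2.92}$$

Equation (2.92) can be integrated to obtain the integral equation

$$z(t) = z_0 + \int_0^t e^{-[A]\sigma_1} [D]\, e^{[A]\sigma_1} z(\sigma_1)\, u(\sigma_1)\, d\sigma_1 + \int_0^t e^{-[A]\sigma_1} b\, u(\sigma_1)\, d\sigma_1 \tag{2.93}$$

Applying the method of successive substitutions to solve this integral equation, it is
assumed that $z(\sigma_1)$ takes the form

$$z(\sigma_1) = z_0 + \int_0^{\sigma_1} e^{-[A]\sigma_2} [D]\, e^{[A]\sigma_2} z(\sigma_2)\, u(\sigma_2)\, d\sigma_2 + \int_0^{\sigma_1} e^{-[A]\sigma_2} b\, u(\sigma_2)\, d\sigma_2$$

Then, substituting into equation (2.93), we obtain

$$\begin{aligned}
z(t) ={}& z_0 + \int_0^t e^{-[A]\sigma_1} [D]\, e^{[A]\sigma_1} z_0\, u(\sigma_1)\, d\sigma_1 + \int_0^t e^{-[A]\sigma_1} b\, u(\sigma_1)\, d\sigma_1 \\
&+ \int_0^t \int_0^{\sigma_1} e^{-[A]\sigma_1} [D]\, e^{[A]\sigma_1} e^{-[A]\sigma_2} b\, u(\sigma_2)\, u(\sigma_1)\, d\sigma_2\, d\sigma_1 \\
&+ \int_0^t \int_0^{\sigma_1} e^{-[A]\sigma_1} [D]\, e^{[A]\sigma_1} e^{-[A]\sigma_2} [D]\, e^{[A]\sigma_2} z(\sigma_2)\, u(\sigma_2)\, u(\sigma_1)\, d\sigma_2\, d\sigma_1
\end{aligned} \tag{2.94}$$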

Note  that,  from  property  (3)  of  the  state  transition  matrix, 


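$$e^{[A]\sigma_1} e^{-[A]\sigma_2} = [\Phi(\sigma_1, 0)]\, [\Phi(0, \sigma_2)] = [\Phi(\sigma_1, \sigma_2)] = e^{[A](\sigma_1 - \sigma_2)}$$

Then, equation (2.94) can be simplified to the form

$$\begin{aligned}
z(t) ={}& z_0 + \int_0^t e^{-[A]\sigma_1} [D]\, e^{[A]\sigma_1} z_0\, u(\sigma_1)\, d\sigma_1 + \int_0^t e^{-[A]\sigma_1} b\, u(\sigma_1)\, d\sigma_1 \\
&+ \int_0^t \int_0^{\sigma_1} e^{-[A]\sigma_1} [D]\, e^{[A](\sigma_1 - \sigma_2)} b\, u(\sigma_2)\, u(\sigma_1)\, d\sigma_2\, d\sigma_1 \\
&+ \int_0^t \int_0^{\sigma_1} e^{-[A]\sigma_1} [D]\, e^{[A](\sigma_1 - \sigma_2)} [D]\, e^{[A]\sigma_2} z(\sigma_2)\, u(\sigma_2)\, u(\sigma_1)\, d\sigma_2\, d\sigma_1
\end{aligned} \tag{2.95}$$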

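Continuing with the successive substitutions, $z(\sigma_2)$ is assumed to take the form

$$z(\sigma_2) = z_0 + \int_0^{\sigma_2} e^{-[A]\sigma_3} [D]\, e^{[A]\sigma_3} z(\sigma_3)\, u(\sigma_3)\, d\sigma_3 + \int_0^{\sigma_2} e^{-[A]\sigma_3} b\, u(\sigma_3)\, d\sigma_3$$

Then, we have the following expression for $z$:

$$\begin{aligned}
z(t) ={}& z_0 + \int_0^t e^{-[A]\sigma_1} [D]\, e^{[A]\sigma_1} z_0\, u(\sigma_1)\, d\sigma_1 + \int_0^t e^{-[A]\sigma_1} b\, u(\sigma_1)\, d\sigma_1 \\
&+ \int_0^t \int_0^{\sigma_1} e^{-[A]\sigma_1} [D]\, e^{[A](\sigma_1 - \sigma_2)} [D]\, e^{[A]\sigma_2} z_0\, u(\sigma_2)\, u(\sigma_1)\, d\sigma_2\, d\sigma_1 \\
&+ \int_0^t \int_0^{\sigma_1} e^{-[A]\sigma_1} [D]\, e^{[A](\sigma_1 - \sigma_2)} b\, u(\sigma_2)\, u(\sigma_1)\, d\sigma_2\, d\sigma_1 \\
&+ \int_0^t \int_0^{\sigma_1} \int_0^{\sigma_2} e^{-[A]\sigma_1} [D]\, e^{[A](\sigma_1 - \sigma_2)} [D]\, e^{[A](\sigma_2 - \sigma_3)} b\, u(\sigma_3)\, u(\sigma_2)\, u(\sigma_1)\, d\sigma_3\, d\sigma_2\, d\sigma_1 \\
&+ \int_0^t \int_0^{\sigma_1} \int_0^{\sigma_2} e^{-[A]\sigma_1} [D]\, e^{[A](\sigma_1 - \sigma_2)} [D]\, e^{[A](\sigma_2 - \sigma_3)} [D]\, e^{[A]\sigma_3} z(\sigma_3)\, u(\sigma_3)\, u(\sigma_2)\, u(\sigma_1)\, d\sigma_3\, d\sigma_2\, d\sigma_1
\end{aligned} \tag{2.96}$$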

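Just as in the previous example, as the number of iterations in the method of
successive substitutions is increased, the last term approaches zero. Using equation (2.90)
to transform the state variable back to $x$, and using the output equation given in
equation (2.89), we obtain the following expression for the output of the bilinear
state equation:

$$\begin{aligned}
y(t) ={}& c^T e^{[A]t} x_0 + c^T \int_0^t e^{[A](t-\sigma_1)} [D]\, e^{[A]\sigma_1} x_0\, u(\sigma_1)\, d\sigma_1 + c^T \int_0^t e^{[A](t-\sigma_1)} b\, u(\sigma_1)\, d\sigma_1 \\
&+ \sum_{k=2}^{\infty} c^T \int_0^t \int_0^{\sigma_1} \cdots \int_0^{\sigma_{k-1}} e^{[A](t-\sigma_1)} [D]\, e^{[A](\sigma_1 - \sigma_2)} [D] \cdots [D]\, e^{[A]\sigma_k} x_0\, u(\sigma_k) \cdots u(\sigma_1)\, d\sigma_k \cdots d\sigma_1 \\
&+ \sum_{k=2}^{\infty} c^T \int_0^t \int_0^{\sigma_1} \cdots \int_0^{\sigma_{k-1}} e^{[A](t-\sigma_1)} [D]\, e^{[A](\sigma_1 - \sigma_2)} [D] \cdots [D]\, e^{[A](\sigma_{k-1} - \sigma_k)} b\, u(\sigma_k) \cdots u(\sigma_1)\, d\sigma_k \cdots d\sigma_1
\end{aligned} \tag{2.97}$$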

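Equation (2.97) is a rather complicated Volterra series representation of the output
of the bilinear system. If the initial condition $x_0$ is zero, equation (2.97) takes a much
simpler form. Rugh [15] has shown that, through an appropriate change of variables,
a given state equation with nonzero initial conditions can be transformed into one
with zero initial conditions. If the initial conditions are zero, equation (2.97) reduces
to

$$\begin{aligned}
y(t) ={}& c^T \int_0^t e^{[A](t-\sigma_1)} b\, u(\sigma_1)\, d\sigma_1 \\
&+ \sum_{k=2}^{\infty} c^T \int_0^t \int_0^{\sigma_1} \cdots \int_0^{\sigma_{k-1}} e^{[A](t-\sigma_1)} [D]\, e^{[A](\sigma_1 - \sigma_2)} [D] \cdots [D]\, e^{[A](\sigma_{k-1} - \sigma_k)} b\, u(\sigma_k) \cdots u(\sigma_1)\, d\sigma_k \cdots d\sigma_1
\end{aligned} \tag{2.98}$$

It is not difficult to show that equation (2.98) is, indeed, in the form of a Volterra
series. Recall that a Volterra series representation of a nonlinear system can be
written as

$$y(t) = \sum_{n=1}^{\infty} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} h_n(t-\sigma_1, \ldots, t-\sigma_n)\, u(\sigma_1) \cdots u(\sigma_n)\, d\sigma_1 \cdots d\sigma_n \tag{2.99}$$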

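Note that, from the properties of the state transition matrix,

$$e^{[A](\sigma_{k-1} - \sigma_k)} = [\Phi(\sigma_{k-1}, t)]\, [\Phi(t, \sigma_k)] = e^{-[A](t - \sigma_{k-1})}\, e^{[A](t - \sigma_k)}$$

Then, equation (2.98) can be written as

$$\begin{aligned}
y(t) ={}& \int_0^t c^T e^{[A](t-\sigma_1)} b\, u(\sigma_1)\, d\sigma_1 \\
&+ \int_0^t \int_0^{\sigma_1} c^T e^{[A](t-\sigma_1)} [D]\, e^{-[A](t-\sigma_1)} e^{[A](t-\sigma_2)} b\, u(\sigma_2)\, u(\sigma_1)\, d\sigma_2\, d\sigma_1 + \cdots
\end{aligned} \tag{2.100}$$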
Jo  Jo 

Clearly,  equation  (2.100)  is  of  the  same  form  as  the  Volterra  series  given  in  equation 
(2.99),  where  the  first  and  second-order  Volterra  kernels  are  given  by 


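$$h_1(\sigma_1) = c^T e^{[A]\sigma_1} b \tag{2.101}$$

$$\begin{aligned}
h_2(\sigma_1, \sigma_2) &= c^T e^{[A]\sigma_1} [D]\, e^{-[A]\sigma_1} e^{[A]\sigma_2} b \\
&= c^T e^{[A]\sigma_1} [D]\, e^{[A](\sigma_2 - \sigma_1)} b
\end{aligned} \tag{2.102}$$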
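
The kernel expressions in equations (2.101) and (2.102) are straightforward to evaluate numerically. The sketch below is a minimal illustration with assumed matrices $[A]$ and $[D]$ and vectors $b$ and $c$. The matrix exponential is computed from an eigendecomposition, which presumes that $[A]$ is diagonalizable; this helper and every name in the block are hypothetical choices, not part of the original derivation.

```python
import numpy as np

def expm_eig(M):
    # exp(M) = V diag(exp(w)) V^{-1}; valid only when M is diagonalizable.
    w, V = np.linalg.eig(M)
    return ((V * np.exp(w)) @ np.linalg.inv(V)).real

def h1(A, b, c, s1):
    # Equation (2.101)
    return float(c @ expm_eig(A * s1) @ b)

def h2(A, D, b, c, s1, s2):
    # Second line of equation (2.102)
    return float(c @ expm_eig(A * s1) @ D @ expm_eig(A * (s2 - s1)) @ b)

# Assumed example data
A = np.array([[0.0, 1.0], [-4.0, -0.4]])
D = np.array([[0.0, 0.0], [0.3, 0.0]])
b = np.array([0.0, 1.0])
c = np.array([1.0, 0.0])
s1, s2 = 0.4, 0.9

h2_compact = h2(A, D, b, c, s1, s2)
# First line of equation (2.102), evaluated factor by factor
h2_expanded = float(c @ expm_eig(A * s1) @ D @ expm_eig(-A * s1) @ expm_eig(A * s2) @ b)
```

The two forms of equation (2.102) agree numerically. In the scalar case with $[A] = a$, $[D] = d$, and $b = c = 1$, the expressions reduce to $h_1(\sigma_1) = e^{a\sigma_1}$ and $h_2(\sigma_1, \sigma_2) = d\, e^{a\sigma_2}$, which provides a simple sanity check.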

The  same  technique  that  was  used  to  obtain  Volterra  series  representations  of 
the  output  of  the  linear  and  bilinear  state  equations  can  be  used  to  generate  Volterra 
series  for  general  time-invariant,  nonlinear  ODEs  of  the  form 


$$\begin{aligned}
\dot{x}(t) &= f(x(t), u(t)), \quad t \ge 0 \\
y(t) &= h(x(t)), \quad x(0) = x_0
\end{aligned} \tag{2.103}$$


It  is  interesting  to  note  that  ODEs  naturally  give  rise  to  Volterra  series  representations 
that  are  composed  of  kernels  that  are  in  the  triangular  form  described  in  the  previous 
section.  It  should  also  be  noted  that  the  method  of  successive  substitutions  is  not 
the  only  technique  available  for  deriving  analytical  kernels  from  ODEs.  Ku  and 
Wolf  [9]  used  a Taylor  series  expansion  of  the  nonlinear  terms  in  ODEs  to  derive 
analytical forms of the kernels. Fliess et al. [12] used iterated integrals to derive a
general expression for the Volterra kernels corresponding to nonlinear ODEs. As a
final  example,  Schetzen  [16]  introduced  the  concept  of  the  pth-order  inverse  in  order 
to  invert  the  pth-order  Volterra  operator  and  determine  the  corresponding  kernel. 

2.5  Existence,  Uniqueness,  and  Convergence  Issues 
A review  of  Volterra  series  would  be  incomplete  without  some  discussion  of  the 
existence,  uniqueness,  and  convergence  of  these  representations.  It  is  important  to 
distinguish  which  types  of  systems  are  amenable  to  Volterra  series  representations 
and  whether  or  not  such  representations  are  unique.  Furthermore,  it  is  essential 
to  consider  the  convergence  properties  of  Volterra  series  in  order  to  apply  them  to 
the  analysis  of  dynamical  systems.  These  fundamental  issues  are  addressed  in  this 
section. 

Boyd  and  Chua  [8]  established  that  a system  must  possess  fading  memory  in 
order  to  be  represented  in  terms  of  a Volterra  series.  While  the  concept  of  fading 
memory  was  alluded  to  in  the  early  works  of  Volterra  [7]  and  Wiener  [13],  Boyd  and 
Chua  were  the  first  to  provide  a formal  definition  of  the  term. 

Definition 2.1 [8]: An operator $N$ has fading memory on a subset $K$ of $C(\mathbb{R})$, the
set of all bounded continuous real-valued functions, if there is a decreasing function
$w : \mathbb{R}_+ \to (0, 1]$, $\lim_{t \to \infty} w(t) = 0$, such that for each $u \in K$ and $\varepsilon > 0$, there is a $\delta > 0$
such that for all $v \in K$,

$$\sup_{t \le 0} |u(t) - v(t)|\, w(-t) < \delta \;\Rightarrow\; |Nu(0) - Nv(0)| < \varepsilon \tag{2.104}$$

In  this  definition,  time  t = 0 corresponds  to  the  present  time.  In  words,  Definition 
2.1 states that a system exhibits fading memory if two input signals that are close
in the recent past, but not necessarily close in the remote past, result in close present
outputs. The weighting function $w(-t)$ approaches 0 as the time approaches $-\infty$ to
indicate the fading influence of past inputs. For a fading memory system, an input
occurring  in  the  distant  past  will  eventually  have  no  influence  on  the  present  output. 
Boyd  and  Chua  [8]  noted  that  fading  memory  corresponds  to  a stronger  requirement 
than  continuity,  which  states  that  input  signals  that  are  close  over  all  past  time  result 
in  close  present  outputs.  They  also  remarked  that  for  linear  time-invariant  systems, 
fading  memory  is  equivalent  to  having  a convolution  representation. 

Originally, approximation results for Volterra series representations of time-invariant systems were restricted by the limitations of the Stone-Weierstrass theorem. In short, these limitations restricted the approximation results to input signals defined on a finite time interval and applied to the approximation of an operator $N$ over a finite time interval. The fading memory concept enabled Boyd and Chua [8] to state an approximation theorem that was free of these constraints:

Theorem 2.1 [8]: Let $\epsilon > 0$ and

$$
K := \{\, u \in C(\mathbb{R}) : |u(t)| \le M_1,\ |u(s) - u(t)| \le M_2 (s - t) \ \text{for } t \le s \,\} \tag{2.105}
$$

Suppose that $N$ is any time-invariant operator with fading memory on $K$. Then, there is a finite Volterra series operator $\hat{N}$ such that, for all $u \in K$,

$$
\| Nu - \hat{N}u \|_\infty \le \epsilon \tag{2.106}
$$

Note that the $\infty$-norm of a function $f$ is defined as

$$
\| f \|_\infty := \sup_{t \in \mathbb{R}} |f(t)| \tag{2.107}
$$

Theorem 2.1 states that a time-invariant operator with fading memory, subject to a bounded input excitation whose rate of change is also bounded, as required by (2.105), can be approximated to arbitrary accuracy by a truncated Volterra series representation. Because it invokes the requirement of fading memory, the theorem is not subject to any of the previous restrictions.

Boyd et al. [10] addressed the issue of uniqueness of Volterra series representations. They provided the following uniqueness theorem, which is stated here without proof:

Theorem 2.2 [10]: Suppose $N$ and $M$ are Volterra series operators with kernels $h_n$ and $g_n$, respectively. Then $N = M$ if and only if $h_{n,\mathrm{sym}} = g_{n,\mathrm{sym}}$ for all $n$.

Theorem 2.2 states that two Volterra series representations are equivalent if and only if the corresponding symmetric kernels are equal. As noted earlier, although Volterra kernels can be expressed in different forms, the symmetric form of the kernels is always unique. Therefore, Theorem 2.2 assures that the set of symmetric Volterra kernels that characterizes a given system is unique to that system.


The  convergence  of  Volterra  series  was  addressed  in  the  following  theorem  from 
Ku  and  Wolf  [9]: 

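Theorem 2.3 [9]: Let

$$
y_n(t) = \int_0^\infty \cdots \int_0^\infty h_n(\sigma_1, \ldots, \sigma_n)\, u(t - \sigma_1) \cdots u(t - \sigma_n)\, d\sigma_1 \cdots d\sigma_n \tag{2.108}
$$

Then, for a bounded input function $u$ that satisfies

$$
|u(t)| \le M \quad \forall\, t \in \mathbb{R} \tag{2.109}
$$

for some constant $M > 0$, we have

$$
\sum_{n=1}^{\infty} |y_n(t)| \le \sum_{n=1}^{\infty} a_n M^n \tag{2.110}
$$

where

$$
a_n = \int_0^\infty \cdots \int_0^\infty |h_n(\sigma_1, \ldots, \sigma_n)|\, d\sigma_1 \cdots d\sigma_n \tag{2.111}
$$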

The proof of Theorem 2.3 follows a format similar to that of the proofs of the stability results given earlier in this chapter. Theorem 2.3 implies that, as with Taylor series expansions, a radius of convergence for Volterra series can be defined as [9]

$$
\rho = \left( \limsup_{n \to \infty} |a_n|^{1/n} \right)^{-1} \tag{2.112}
$$

Then, absolute convergence of the Volterra series is guaranteed for any input that satisfies

$$
\| u \|_\infty < \rho \tag{2.113}
$$

Boyd et al. [10] presented a slightly different form of Theorem 2.3 that discussed convergence of the Volterra series in terms of a gain bound function $f$, defined as

$$
f(x) := \sum_{n=1}^{\infty} \| h_n \|_\infty\, x^n \tag{2.114}
$$

As noted by Boyd et al. [10], Theorem 2.3 can be used to compute an error bound for a truncated Volterra series. Denoting $H$ as the infinite Volterra series and $H^{(k)}$ as the truncated Volterra series

$$
H^{(k)} u(t) = \sum_{n=1}^{k} \int_0^\infty \cdots \int_0^\infty h_n(\sigma_1, \ldots, \sigma_n)\, u(t - \sigma_1) \cdots u(t - \sigma_n)\, d\sigma_1 \cdots d\sigma_n \tag{2.115}
$$

the truncation error is bounded as

$$
\| Hu - H^{(k)} u \| \le \sum_{n=k+1}^{\infty} a_n M^n \tag{2.116}
$$

It then follows that the truncation error is $o(\|u\|^k)$ [10]. As a final note, Sandberg [11] has addressed the convergence properties of the “doubly finite” Volterra series in which, in addition to truncating the series, the kernels are also truncated after a finite period of time. This assumption is frequently employed in practice and is consistent with the concept of fading memory, which dictates that the Volterra kernels of a fading memory system should decay to zero in a finite period of time. Sandberg has also investigated the convergence of discrete-time Volterra series representations [97] and the convergence of Volterra series for time-varying nonlinear systems [98].
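
To make the radius of convergence (2.112) and the truncation bound (2.116) concrete, the following Python sketch evaluates both for a hypothetical family of separable kernels $h_n(\sigma_1, \ldots, \sigma_n) = r^n e^{-(\sigma_1 + \cdots + \sigma_n)}$, for which (2.111) gives $a_n = r^n$. The kernel family, the values of `r` and `M`, and the function names are assumptions made only for this illustration; they do not come from the cited references.

```python
import numpy as np

r = 2.0    # assumed kernel gain, so that a_n = r**n
M = 0.3    # input bound, chosen so that M < rho = 1/r

def a(n):
    """Kernel norm (2.111) for the assumed kernels r**n * exp(-(s_1 + ... + s_n))."""
    return r ** n

# Finite-n stand-in for the lim sup in (2.112).
n = np.arange(1, 200)
rho_estimate = 1.0 / np.max(a(n[50:]) ** (1.0 / n[50:]))

def truncation_bound(k, n_terms=200):
    """Right-hand side of (2.116), with the infinite sum cut off after n_terms terms."""
    idx = np.arange(k + 1, k + 1 + n_terms)
    return np.sum(a(idx) * M ** idx)

print("estimated radius of convergence:", rho_estimate)
for k in (1, 2, 4, 8):
    # Compare with the closed-form geometric tail (r M)^(k+1) / (1 - r M).
    print(k, truncation_bound(k), (r * M) ** (k + 1) / (1.0 - r * M))
```

For this particular family a constant input $u(t) = M$ gives $y_n(t) = (rM)^n$, so the bound is attained; for general kernels the bound may be conservative.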


CHAPTER  3 

REVIEW  OF  WAVELETS 


This chapter presents an overview of wavelets and multiresolution analysis. Because this field is relatively new, this chapter is intended to be a self-contained review of some of the basic properties of wavelets, as well as how they are used in the approximation of functions and signals. In particular, the decomposition formulas that are implemented in the calculation of the discrete wavelet transform are derived for a number of different wavelet bases. Section 3.1 gives an introduction to wavelets and multiresolution analysis. Then, in Section 3.2, wavelets are compared to two commonly used tools for the approximation of functions, Fourier analysis and the finite element method. In Section 3.3, the Haar wavelet is introduced as a prototype wavelet for which many general wavelet properties are easy to visualize. In Section 3.4, the discussion is extended to include all orthonormal wavelet families. Then, continuing in order of increasing generality, orthonormal multiwavelets, wavelets that are generated from multiple scaling functions, are considered in Section 3.5. Finally, in Section 3.6, tensor product wavelets are introduced as a means of approximating two-dimensional and higher-dimensional functions. The material in this chapter forms a large portion of the background that will be useful later in deriving wavelet-based representations of first and second-order Volterra kernels.

3.1  Wavelets  and  Multiresolution  Analysis 
In simple terms, a wavelet is a compactly-supported, oscillatory function that is typically constructed to satisfy additional properties such as smoothness, orthogonality, and regularity conditions. A wavelet basis is composed of the scaled translates and dilates of the wavelet $\psi$, known as the “mother wavelet”, and forms a basis for the Hilbert space $L^2(\mathbb{R})$. The space $L^2(\mathbb{R})$ contains all square-integrable functions, or functions that satisfy

$$
\int_{\mathbb{R}} |f(x)|^2\, dx < \infty \tag{3.1}
$$

This space has an associated inner product defined as

$$
(f, g)_{L^2(\mathbb{R})} := \int_{\mathbb{R}} f(x)\, \overline{g(x)}\, dx \tag{3.2}
$$

where the overbar denotes complex conjugation. Every wavelet $\psi$ has an associated scaling function $\phi$. The scaling function and wavelet pair $(\phi, \psi)$ are derived from the solution of the two-scale equations:

$$
\phi(x) = \sqrt{2} \sum_{s} a_s\, \phi(2x - s) \tag{3.3}
$$

$$
\psi(x) = \sqrt{2} \sum_{s} b_s\, \phi(2x - s) \tag{3.4}
$$

where $\{a_s\}$ and $\{b_s\}$ are constant scaling function and wavelet filters. These equations state that the scaling function $\phi$ and the wavelet $\psi$ are formed as linear combinations of scaling functions that have been contracted to double the frequency (or half the support). Because the scaling functions are used to form wavelets, they are also known as generators.
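
As an illustration of the two-scale equations (3.3) and (3.4), the following minimal Python sketch checks them numerically for the Haar pair, the simplest example of a scaling function and wavelet. The filter values $a = (1/\sqrt{2}, 1/\sqrt{2})$ and $b = (1/\sqrt{2}, -1/\sqrt{2})$, the evaluation grid, and the function names are assumptions introduced for this example rather than values taken from the text above.

```python
import numpy as np

def haar_phi(x):
    """Haar scaling function: 1 on [0, 1), 0 elsewhere."""
    x = np.asarray(x, dtype=float)
    return np.where((x >= 0.0) & (x < 1.0), 1.0, 0.0)

def haar_psi(x):
    """Haar wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    x = np.asarray(x, dtype=float)
    return np.where((x >= 0.0) & (x < 0.5), 1.0,
                    np.where((x >= 0.5) & (x < 1.0), -1.0, 0.0))

# Assumed Haar filters, indexed s = 0, 1.
a = np.array([1.0, 1.0]) / np.sqrt(2.0)
b = np.array([1.0, -1.0]) / np.sqrt(2.0)

def two_scale(filt, x):
    """Right-hand side of (3.3)/(3.4): sqrt(2) * sum_s filt[s] * phi(2x - s)."""
    return np.sqrt(2.0) * sum(c * haar_phi(2.0 * x - s) for s, c in enumerate(filt))

x = np.linspace(-1.0, 2.0, 1201)
phi_err = np.max(np.abs(haar_phi(x) - two_scale(a, x)))
psi_err = np.max(np.abs(haar_psi(x) - two_scale(b, x)))
print(phi_err, psi_err)   # both should be at the level of rounding error
```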

By  construction,  the  scaling  functions  form  a multiresolution  analysis  for  the 
analysis  of  functions  and  signals  in  L2(R).  Conceptually,  a multiresolution  analysis 
decomposes  a function  into  components  that  show  varying  amounts  of  detail,  similar 
to  the  effect  achieved  when  one  views  an  object  through  binoculars  and  varies  the 
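magnification power. More rigorously, a sequence of subspaces $\{V_j\}_{j \in \mathbb{Z}} \subset L^2(\mathbb{R})$, where $\mathbb{Z}$ is the set of all integers, forms a multiresolution analysis provided that

$$
\begin{aligned}
&(1)\quad \cdots \subset V_{-2} \subset V_{-1} \subset V_0 \subset V_1 \subset V_2 \subset \cdots \\
&(2)\quad \bigcap_{j \in \mathbb{Z}} V_j = \{0\} \\
&(3)\quad \overline{\bigcup_{j \in \mathbb{Z}} V_j} = L^2(\mathbb{R}) \\
&(4)\quad f(x) \in V_j \;\Leftrightarrow\; f(2x) \in V_{j+1} \\
&(5)\quad f(x) \in V_j \;\Rightarrow\; f(x - k) \in V_j \quad \forall\, k \in \mathbb{Z} \\
&(6)\quad \text{The family of integer translates } \{\phi(x - k)\}_{k \in \mathbb{Z}} \text{ forms a Riesz basis for } V_0
\end{aligned} \tag{3.5}
$$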

A multiresolution analysis is obtained by forming a series of nested spaces $V_{j-1} \subset V_j$, $j \in \mathbb{Z}$. These spaces are generated from the scaled translates and dilates of the scaling function $\phi$. Later in this chapter, the more general case in which the spaces $\{V_j\}$ are generated from a set of $r$ scaling functions $\{\phi^1, \ldots, \phi^r\}$ will be discussed. For the single-generator case, the space $V_0$ is defined as the span of integer translates of $\phi$:

$$
V_0 := \operatorname{span}\{\phi(\cdot - k)\}_{k \in \mathbb{Z}} \tag{3.6}
$$

In general, the space $V_j$ is defined as

$$
V_j := \operatorname{span}\{\phi_{j,k}\}_{k \in \mathbb{Z}} \tag{3.7}
$$

where each function $\phi_{j,k}$ is defined as

$$
\phi_{j,k}(x) := 2^{j/2}\, \phi(2^j x - k) \tag{3.8}
$$

From equation (3.8), it is clear that the functions $\{\phi_{j,k}\}$ are formed as scaled dilates and translates of the scaling function $\phi$. The integer $j$ is known as the dilation index and the integer $k$ represents the translation index. The $2^{j/2}$ term is a normalization factor that ensures that the $L^2(\mathbb{R})$ norm of each scaling function is one. The $L^2(\mathbb{R})$ norm of a function $f$ is given by

$$
\| f \|_{L^2(\mathbb{R})} := \left( \int_{\mathbb{R}} |f(x)|^2\, dx \right)^{1/2} \tag{3.9}
$$

It is clear that, as the index $j$ is increased, the scaling functions that span the corresponding space $V_j$ are supported on increasingly finer grids. These grids are dyadic in nature, since increasing $j$ by one gives a grid that has twice the resolution. As $j \to \infty$, the scaling functions are supported on an infinitesimal grid. In the limit, any function in $L^2(\mathbb{R})$ can be represented exactly. Thus, as $j \to \infty$ the spaces $\{V_j\}$ approach $L^2(\mathbb{R})$. Conversely, as $j$ is decreased, the spaces $\{V_j\}$ are spanned by scaling functions that have increasingly wider support. As $j \to -\infty$, the spaces $\{V_j\}$ approach $\{0\}$, the space containing only the zero function. The reason for this is that, as $j \to -\infty$, the scaling functions are supported on an infinite domain. Thus, to this point, it is clear that the spaces $\{V_j\}$ satisfy the first three multiresolution conditions. These conditions are summarized in Figure 3.1.

Figure 3.1: Nested Spaces $\{V_j\} \subset L^2(\mathbb{R})$

By construction, the fourth and fifth requirements for a multiresolution analysis are satisfied. If a function $f(x)$ is contained in the space $V_j$, the function $f(2x)$ must be contained in the space $V_{j+1}$. This is apparent because the space $V_{j+1}$ is spanned by scaling functions with double the frequency of those that span $V_j$. The fifth property states that if $f$ is contained in $V_j$, the integer translates of $f$ are also contained in $V_j$. This is true because the space $V_j$ is spanned by all integer translates of the dilated scaling function $\phi(2^j x)$. The final requirement for a multiresolution analysis
is that the family of integer translates $\{\phi(x - k)\}_{k \in \mathbb{Z}}$ forms a Riesz basis for $V_0$. This requirement is satisfied if the family of translates $\{\phi(x - k)\}$ spans $V_0$ and there exist constants $0 < A \le B < \infty$ such that, for all sequences $\{p_k\}_{k \in \mathbb{Z}} \in \ell^2(\mathbb{Z})$,

$$
A \sum_{k \in \mathbb{Z}} |p_k|^2 \le \Big\| \sum_{k \in \mathbb{Z}} p_k\, \phi(x - k) \Big\|^2_{L^2(\mathbb{R})} \le B \sum_{k \in \mathbb{Z}} |p_k|^2 \tag{3.10}
$$

(see Ref. [75]). It will be noted here, without proof, that scaling functions satisfy equation (3.10), commonly known as the frame condition. In particular, if the family of translates $\{\phi(x - k)\}$ forms an orthonormal basis for $V_0$, then equation (3.10) becomes an equality with $A = B = 1$ [99].
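
The condition (3.10) can be explored numerically for a generator whose translates are not orthonormal. The sketch below uses the piecewise-linear hat function on $[0, 2]$ as an assumed example (it is not a function singled out in the text above) and a finite set of $n$ translates. For a coefficient vector $p$, the squared norm in (3.10) equals $p^T G p$, where $G$ is the Gram matrix of the translates, so the smallest and largest eigenvalues of $G$ serve as estimates of $A$ and $B$.

```python
import numpy as np

def hat(x):
    """Piecewise-linear hat function supported on [0, 2] with peak value 1 at x = 1."""
    return np.maximum(0.0, 1.0 - np.abs(np.asarray(x, dtype=float) - 1.0))

def gram_matrix(n, samples_per_unit=500):
    """Gram matrix G[k, m] = integral of hat(x - k) * hat(x - m), by the midpoint rule."""
    h = 1.0 / samples_per_unit
    x = (np.arange((n + 1) * samples_per_unit) + 0.5) * h   # midpoints covering [0, n + 1]
    basis = np.stack([hat(x - k) for k in range(n)])         # one row per translate
    return (basis * h) @ basis.T

n = 40
G = gram_matrix(n)
eigenvalues = np.linalg.eigvalsh(G)          # ascending order for a symmetric matrix
A_est, B_est = eigenvalues[0], eigenvalues[-1]
print(A_est, B_est)
```

For this generator the diagonal and first off-diagonal Gram entries are 2/3 and 1/6, and the estimates should fall between 1/3 and 1; an orthonormal generator such as the Haar scaling function would instead give the identity matrix and $A = B = 1$.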

At this point, the relationship of the wavelet $\psi$ to a multiresolution analysis has not yet been clarified. By construction, wavelets span the orthogonal complement spaces $\{W_j\}$, defined as the differences between two consecutive spaces $\{V_j\}$:

$$
W_j := V_{j+1} \ominus V_j \tag{3.11}
$$

The wavelet space $W_j$ is defined as

$$
W_j := \operatorname{span}\{\psi_{j,k}\}_{k \in \mathbb{Z}} \tag{3.12}
$$

The notation $\psi_{j,k}$ represents the scaled translates and dilates of the wavelet $\psi$:

$$
\psi_{j,k}(x) := 2^{j/2}\, \psi(2^j x - k) \tag{3.13}
$$

From equation (3.11), it is apparent that a given wavelet space $W_j$ provides the details that are present in a representation of a function in $V_{j+1}$, but are not present in the coarser representation of the function in $V_j$. In practice, a fine-scale space $V_j$ can be decomposed as

$$
V_j = V_{j-1} \oplus W_{j-1} \tag{3.14}
$$

Clearly, by extension, the space $V_{j-1}$ can be decomposed as

$$
V_{j-1} = V_{j-2} \oplus W_{j-2} \tag{3.15}
$$

and equation (3.14) becomes

$$
V_j = V_{j-2} \oplus W_{j-2} \oplus W_{j-1} \tag{3.16}
$$

Through a recursive application of equation (3.14), the following multilevel decomposition of $V_j$ is obtained:

$$
V_j = V_{j_0} \oplus W_{j_0} \oplus W_{j_0+1} \oplus \cdots \oplus W_{j-2} \oplus W_{j-1} \tag{3.17}
$$

where $j_0$ represents the coarsest level used in the decomposition. Therefore, the space $V_j$ is equivalent to a sum of wavelet spaces corresponding to a number of different resolution levels plus a coarse-scale space $V_{j_0}$. In the limit as $j_0 \to -\infty$, the decomposition of $V_j$ takes the form

$$
V_j = \bigoplus_{l=-\infty}^{j-1} W_l \tag{3.18}
$$

Furthermore, as $j \to \infty$, $V_j \to L^2(\mathbb{R})$. In that case, we have

$$
L^2(\mathbb{R}) = \bigoplus_{l=-\infty}^{\infty} W_l \tag{3.19}
$$

Therefore, the wavelets that collectively span the spaces $\{W_j\}_{j \in \mathbb{Z}}$ form a basis for $L^2(\mathbb{R})$.

In practice, a wavelet-based multiresolution analysis of a function $f$ is obtained by first forming a single-scale approximation of the function in terms of the scaling functions that span $V_j$. This single-scale approximation takes the form

$$
f_j(x) = \sum_{k} \alpha_{j,k}\, \phi_{j,k}(x) \tag{3.20}
$$

where $\{\alpha_{j,k}\}$ represent constant coefficients in the expansion. The notation $f_j$ denotes an approximation of $f$ in $V_j$. The calculation of the coefficients $\{\alpha_{j,k}\}$ is particularly straightforward if the scaling functions $\{\phi_{j,k}\}$ are mutually orthonormal. The scaling functions are mutually orthonormal if

$$
\int_{\mathbb{R}} \phi_{j,k}(x)\, \phi_{j,m}(x)\, dx = \delta_{k,m} \tag{3.21}
$$

where $\delta_{k,m}$ denotes the Kronecker delta, defined as

$$
\delta_{k,m} := \begin{cases} 0 & k \ne m \\ 1 & k = m \end{cases} \tag{3.22}
$$

In this case, the coefficients $\{\alpha_{j,k}\}$ can be calculated as

$$
\alpha_{j,k} = \int_{\mathbb{R}} f(x)\, \phi_{j,k}(x)\, dx \tag{3.23}
$$

In other words, if the scaling functions are orthonormal, the terms in the single-scale approximation of $f$ are simply the projections of $f$ onto the various scaling functions $\{\phi_{j,k}\}$. In many cases, the integrals in equation (3.23) are not convenient to evaluate. Indeed, for some wavelet families, the functions are fractal in nature and have no known analytical form. Therefore, it is common in practice to assume that the single-scale coefficients are simply scaled samples of the function or signal $f$:

$$
\alpha_{j,k} \approx f(2^{-j} k) \tag{3.24}
$$

Equation (3.24) gives a reasonable approximation of the coefficients if $j$ is large because the scaling functions tend to behave as Dirac delta functions when they are supported on fine-resolution grids.
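
The quality of the sampling approximation can be checked directly for the Haar scaling function, for which the integral in (3.23) reduces to a scaled local average over a dyadic cell. The Python sketch below compares the projections with samples for increasing $j$. The test signal, the restriction to $[0, 1)$, and the function names are assumptions; so is the explicit factor $2^{-j/2}$ attached to the samples, which follows from the normalization in (3.8) when the scaling function has unit integral and is one reading of the word “scaled” in the text.

```python
import numpy as np

def f(x):
    """Smooth test signal (an arbitrary choice for this illustration)."""
    return np.sin(2.0 * np.pi * x)

def haar_projection_coeffs(j, quad_points=64):
    """alpha_{j,k} = integral of f * phi_{j,k} over [k 2^-j, (k+1) 2^-j), k = 0..2^j - 1."""
    n = 2 ** j
    width = 1.0 / n
    offsets = (np.arange(quad_points) + 0.5) / quad_points   # midpoint rule in each cell
    x = (np.arange(n)[:, None] + offsets[None, :]) * width
    return 2.0 ** (j / 2.0) * f(x).mean(axis=1) * width

def sampled_coeffs(j):
    """Scaled samples 2^{-j/2} f(2^{-j} k), k = 0..2^j - 1."""
    n = 2 ** j
    return 2.0 ** (-j / 2.0) * f(np.arange(n) / n)

errors = {}
for j in (3, 5, 7, 9):
    exact = haar_projection_coeffs(j)
    approx = sampled_coeffs(j)
    errors[j] = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
    print(j, errors[j])   # relative error shrinks as the grid is refined
```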

Equation (3.17) implies that there is an equivalent, multiscale representation of $f_j$ given by

$$
f_j(x) = \sum_{k} \alpha_{j_0,k}\, \phi_{j_0,k}(x) + \sum_{l=j_0}^{j-1} \sum_{k} \beta_{l,k}\, \psi_{l,k}(x) \tag{3.25}
$$

where $\{\beta_{l,k}\}$ denote the wavelet coefficients on level $l$. This multilevel expansion is in terms of wavelets on levels $j_0$ through $j - 1$ and scaling functions on level $j_0$ only. The scaling function coefficients on the coarsest level represent a coarse-scale averaging of the function $f$. Meanwhile, the wavelet coefficients on the different levels represent details of varying resolution. These multiscale wavelet coefficients are known as the discrete wavelet transform of the function $f_j$. Note that, due to the compact support of the wavelets, the multilevel representation gives information not only about the frequency content of the function, but also spatial information. In other words, the wavelet coefficients are localized in space and give information about where certain features appear in the function. This property of spatial localization renders wavelets especially effective (and efficient) for analyzing functions or signals that exhibit transient behavior or have sharp features or discontinuities. More will be said about this in the next section in comparing wavelets to Fourier analysis. The multiscale expansion in equation (3.25) is often a very efficient representation in that many of the coefficients are close to or equal to zero. Often, a function can be accurately represented using relatively few wavelet coefficients. Furthermore, the discrete wavelet transform can be calculated in a fast, efficient manner from the single-scale coefficients in equation (3.20). Later, this will be shown to be a fast-filter operation and the relevant decomposition formulas will be derived for various wavelet families.
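
The passage from single-scale coefficients to multiscale coefficients can be sketched for the Haar basis, where each decomposition step is a pairwise sum and difference scaled by $1/\sqrt{2}$. The function names, the test signal, and the threshold used to count small coefficients are assumptions made for this illustration; the decomposition formulas for other wavelet families are derived later in the chapter and are not reproduced here.

```python
import numpy as np

def haar_dwt(alpha_fine, levels):
    """Decompose fine-scale scaling coefficients into (coarse, [details...]) for Haar."""
    approx = np.asarray(alpha_fine, dtype=float)
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2.0))   # wavelet coefficients, one level down
        approx = (even + odd) / np.sqrt(2.0)          # scaling coefficients, one level down
    return approx, details[::-1]                      # details ordered coarse to fine

def haar_idwt(approx, details):
    """Invert haar_dwt by undoing each sum/difference step, coarse to fine."""
    approx = np.asarray(approx, dtype=float)
    for d in details:
        fine = np.empty(2 * approx.size)
        fine[0::2] = (approx + d) / np.sqrt(2.0)
        fine[1::2] = (approx - d) / np.sqrt(2.0)
        approx = fine
    return approx

j = 10
x = np.arange(2 ** j) / 2 ** j
signal = np.where(x < 0.3, 1.0, 0.0) + 0.1 * x    # step plus a slow ramp
alpha = 2.0 ** (-j / 2.0) * signal                 # scaled samples as fine-scale coefficients

coarse, details = haar_dwt(alpha, levels=6)
reconstructed = haar_idwt(coarse, details)
all_wavelet = np.concatenate(details)
small = np.sum(np.abs(all_wavelet) < 1e-3 * np.abs(alpha).max())
print(np.max(np.abs(reconstructed - alpha)), small, all_wavelet.size)
```

Each level costs a number of operations proportional to the number of coefficients it processes, which is the sense in which the transform is a fast-filter operation; for this signal most wavelet coefficients fall below the threshold, and the large ones cluster around the step.
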
To this point, there has not been much discussion as to how wavelets are designed. Indeed, there are a large number of wavelet families, which are designed to exhibit various properties such as smoothness, approximation order, and orthogonality. Hence, there are a wide variety of approaches to wavelet design. One feature common to all wavelet families is that the scaling function and wavelet pair $(\phi, \psi)$ must satisfy the two-scale equations. This can be demonstrated by considering the following decomposition of the space $V_1$:

$$
V_1 = V_0 \oplus W_0 \tag{3.26}
$$

where the spaces are defined as

$$
V_1 := \operatorname{span}\{\phi_{1,k}\}_{k \in \mathbb{Z}} = \operatorname{span}\{\sqrt{2}\, \phi(2(\cdot) - k)\}_{k \in \mathbb{Z}} \tag{3.27}
$$

$$
V_0 := \operatorname{span}\{\phi(\cdot - k)\}_{k \in \mathbb{Z}} \tag{3.28}
$$

$$
W_0 := \operatorname{span}\{\psi(\cdot - k)\}_{k \in \mathbb{Z}} \tag{3.29}
$$

From equation (3.26), it is clear that $V_0 \subset V_1$ and $W_0 \subset V_1$. Therefore, the functions that form a basis for $V_0$ can be written as a linear combination of the functions that span $V_1$. The same is true for the wavelets that span $W_0$. Hence, a scaling function and wavelet pair $(\phi, \psi)$ must satisfy the following two-scale equations:

$$
\phi(x) = \sum_{s} a_s\, \phi_{1,s}(x) = \sqrt{2} \sum_{s} a_s\, \phi(2x - s) \tag{3.30}
$$

$$
\psi(x) = \sum_{s} b_s\, \phi_{1,s}(x) = \sqrt{2} \sum_{s} b_s\, \phi(2x - s) \tag{3.31}
$$

Note that equation (3.8) has been used to expand the functions $\{\phi_{1,s}\}$. The constants $\{a_s\}$ and $\{b_s\}$ are known as the scaling function and wavelet filters, respectively. Every scaling function and wavelet pair has a unique set of filters which essentially define the functions. The number and range of filter coefficients is dependent on the support of the particular scaling function or wavelet, which varies among wavelet families.

All  wavelet  families,  by  design,  must  satisfy  the  two-scale  relationships.  Hence, 
the  design  of  a scaling  function  and  wavelet  pair  entails  finding  solutions  to  equations 
(3.30)  and  (3.31).  In  addition,  it  is  usually  desired  that  the  resulting  functions  satisfy 
additional  conditions  such  as  smoothness,  approximation  order,  and  orthogonality. 
These  properties,  as  well  as  the  two-scale  equations,  impose  constraints  on  the  filter 
coefficients.  Therefore,  one  design  approach  is  to  find  an  appropriate  set  of  filters 
that  satisfy  the  relevant  constraints.  The  filter  coefficients  can  be  viewed  as  degrees 
of  freedom  in  the  design  process.  As  the  number  of  filter  coefficients  that  characterize 
a particular  wavelet  family  is  increased,  the  resulting  functions  can  be  designed  to 
satisfy an increasing number of properties. However, increasing the number of coefficients also increases the support of the functions, which is usually not desirable.
Therefore,  the  goal  is  to  find  a minimal  set  of  filters  that  meet  the  design  constraints. 
Once  these  filters  are  found,  the  resulting  scaling  function  and  wavelet  pair  is  derived 
from  the  two-scale  equations  in  a limiting  procedure.  This  procedure  entails  first 
using  the  two-scale  equations  to  calculate  the  function  values  at  integer  knots.  Then, 
the  function  values  are  found  at  half-integer  knots,  followed  by  quarter-integer  knots, 
and  so  forth.  In  the  limit,  the  entire  scaling  function  and  wavelet  can  be  derived. 
Daubechies  used  this  procedure  to  design  the  first  set  of  orthonormal,  compactly- 
supported  wavelets  [70].  The  drawback  of  this  approach  is  that  the  scaling  function 
and  wavelet  are  not  obtained  in  an  analytical  form.  Instead,  they  are  fractal  functions 
because  they  are  defined  by  a limiting  procedure. 
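
The first two steps of this limiting procedure can be sketched for a four-coefficient filter. The example below uses the standard Daubechies four-tap scaling filter, whose values are quoted from general knowledge rather than from this text, together with the normalization of equation (3.30); the helper names are hypothetical. The values at the integer knots are taken as the eigenvector, for eigenvalue one, of the matrix that (3.30) induces on the integers, normalized so that the samples sum to one. The values at the half-integer knots then follow from one more application of (3.30).

```python
import numpy as np

# Assumed filter: standard Daubechies four-tap scaling filter, indexed s = 0..3,
# normalized so that sum(a) = sqrt(2) and sum(a**2) = 1.
s3 = np.sqrt(3.0)
a = np.array([1.0 + s3, 3.0 + s3, 3.0 - s3, 1.0 - s3]) / (4.0 * np.sqrt(2.0))

def coeff(s):
    """Filter coefficient a_s, taken as zero outside the index range 0..3."""
    return a[s] if 0 <= s < a.size else 0.0

# Step 1: integer knots 0..3 from phi(n) = sqrt(2) * sum_s a_s phi(2n - s).
M = np.array([[np.sqrt(2.0) * coeff(2 * n - m) for m in range(4)] for n in range(4)])
eigvals, eigvecs = np.linalg.eig(M)
v = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
phi_int = v / v.sum()                      # normalization: integer samples sum to one

def phi_at_integer(n):
    """Value of phi at the integer n, zero outside the support [0, 3]."""
    return phi_int[n] if 0 <= n <= 3 else 0.0

# Step 2: half-integer knots x = k/2 from the same two-scale equation.
def phi_at_half(k):
    """phi(k/2) = sqrt(2) * sum_s a_s phi(k - s), using the integer-knot values."""
    return np.sqrt(2.0) * sum(coeff(s) * phi_at_integer(k - s) for s in range(4))

phi_half = {k / 2.0: phi_at_half(k) for k in range(0, 7)}
print(phi_int)
print(phi_half)
```

Repeating step 2 on the quarter-integers, eighth-integers, and so on fills in the function on an increasingly fine dyadic grid, which is the sense in which the scaling function is defined only by a limiting process.
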

An alternative approach to wavelet design is to work directly with analytical functions. These functions are used to generate scaling functions and wavelets that satisfy the two-scale equations as well as any additional properties that are desired. In general, this is not a trivial task in the single-generator case. However, it is not difficult to design multiwavelet families using analytical functions. Multiwavelets consist of a set of wavelets $\{\psi^1, \ldots, \psi^r\}$ that are generated from a set of $r$ scaling functions $\{\phi^1, \ldots, \phi^r\}$. In recent years, Donovan, Geronimo, and Hardin have shown that compactly-supported, orthonormal, piecewise-polynomial multiwavelets can be generated from the functions that span finite shift-invariant (FSI) spaces using the technique of intertwining multiresolution analyses [2,3]. This design approach is used in Chapter 4 to construct multiwavelets of arbitrary approximation order from the functions that span the classical finite element spaces. As a final comment, it should be noted that a great deal of wavelet design is done in the frequency domain. The conditions that the filter coefficients must satisfy can be expressed in the frequency domain using the Fourier transform. The filters can then be constructed in the frequency domain using digital filter design techniques.

3.2  Wavelets  and  the  Approximation  of  Functions 
In  this  section,  wavelets  are  compared  to  two  engineering  tools  that  are  commonly 
used  for  the  approximation  of  functions,  Fourier  methods  and  the  finite  element 
method.  The  basic  properties  of  Fourier  series  and  finite  element  functions  will  be 
reviewed.  It  will  be  shown  that,  in  some  respects,  wavelets  represent  an  intermediate 
ground  between  these  two  approximation  techniques.  It  will  be  demonstrated  that 
wavelets  are  particularly  effective  in  the  analysis  of  some  types  of  functions  or  signals 
that  are  not  well  represented  by  Fourier  series  or  finite  element  functions.  Therefore, 
wavelets  provide  an  alternative  tool  for  the  approximation  of  a large  class  of  functions 
and  signals  that  are  encountered  in  engineering  practice. 

3.2.1 Fourier Analysis

Fourier methods comprise one of the most common tools for the approximation of functions in engineering practice. They are used extensively in signal processing applications, particularly in the analysis of periodic functions or signals. The continuous Fourier transform of an analog function or signal $f$ is given by

$$
F(j\omega) = \int_{-\infty}^{\infty} f(x)\, e^{-j\omega x}\, dx \tag{3.32}
$$

The Fourier transform yields a continuous, complex-valued function $F(j\omega)$ in the frequency domain. The original function can be recovered using the inverse Fourier transform:

$$
f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(j\omega)\, e^{j\omega x}\, d\omega \tag{3.33}
$$

The complex exponential functions can be expanded using Euler’s identity:

$$
e^{j\omega x} = \cos(\omega x) + j \sin(\omega x) \tag{3.34}
$$

Then, it is clear that the Fourier transform expresses a function $f$ in terms of its projections onto sinusoids of continuously-varying frequency. In this manner, the Fourier transform gives the frequency spectrum of a function or signal.

The Fourier transform in equation (3.32) gives information about the frequency content of a function or signal in terms of a continuous frequency spectrum. It is common, in practice, to evaluate the frequency content of signals in terms of discrete frequencies instead. This gives rise to the complex Fourier series, which is used in the approximation of periodic functions. The complex Fourier series expansion of a function $f \in L^2_{per}[0, L]$, where $L^2_{per}[0, L]$ is the space of periodic, square-integrable functions with period $L$, takes the form

$$
f(x) = \sum_{n=-\infty}^{\infty} c_n\, e^{j n \omega_0 x} \tag{3.35}
$$

where $\{c_n\}$ are complex Fourier coefficients and $\omega_0$ represents the fundamental frequency, defined as $\omega_0 := 2\pi / L$ rad/s. The complex Fourier series approximates a periodic function in terms of its projections onto complex exponentials at discrete frequencies corresponding to integer multiples of the fundamental frequency $\omega_0$. It is not difficult to show that the complex Fourier series basis functions $\{e^{j n \omega_0 x}\}_{n \in \mathbb{Z}}$ satisfy the following orthogonality condition:

$$
\left( e^{j n \omega_0 x}, e^{j k \omega_0 x} \right)_{L^2_{per}[0,L]} = L\, \delta_{k,n} \tag{3.36}
$$

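Equation (3.36), for the case where $k \ne n$, can be verified as follows:

$$
\left( e^{j n \omega_0 x}, e^{j k \omega_0 x} \right)_{L^2_{per}[0,L]} = \int_0^L e^{j n \omega_0 x}\, e^{-j k \omega_0 x}\, dx = \left. \frac{1}{j(n-k)\omega_0}\, e^{j(n-k)\omega_0 x} \right|_0^L
$$

Using the fact that $\omega_0 = 2\pi / L$, we have

$$
\left( e^{j n \omega_0 x}, e^{j k \omega_0 x} \right)_{L^2_{per}[0,L]} = \frac{1}{j(n-k)\omega_0} \left\{ e^{j(n-k)2\pi} - 1 \right\}
$$

Using Euler’s identity, we can write

$$
e^{j(n-k)2\pi} = \cos[(n-k)2\pi] + j \sin[(n-k)2\pi] = 1
$$

Then, we obtain the result

$$
\left( e^{j n \omega_0 x}, e^{j k \omega_0 x} \right)_{L^2_{per}[0,L]} = 0 \tag{3.37}
$$

If $k = n$, we have

$$
\left( e^{j n \omega_0 x}, e^{j n \omega_0 x} \right)_{L^2_{per}[0,L]} = \int_0^L e^{j n \omega_0 x}\, e^{-j n \omega_0 x}\, dx = \int_0^L dx = L \tag{3.38}
$$

Hence, the functions $\{e^{j n \omega_0 x}\}$ form an orthogonal basis for the space $L^2_{per}[0, L]$.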
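
The orthogonality relations can also be checked numerically. The short Python sketch below approximates the inner products of the complex exponentials on a uniform grid; the period, the grid size, and the range of indices are arbitrary assumptions. On a uniform periodic grid the rectangle rule reproduces these particular integrals to rounding error, so the computed Gram matrix should equal $L$ times the identity.

```python
import numpy as np

L = 2.5                       # period (arbitrary choice)
w0 = 2.0 * np.pi / L          # fundamental frequency
N = 4096                      # number of quadrature points
x = np.arange(N) * L / N      # uniform grid on [0, L)

def inner(n, k):
    """Approximate the L2_per[0, L] inner product of exp(j n w0 x) and exp(j k w0 x)."""
    return np.sum(np.exp(1j * n * w0 * x) * np.conj(np.exp(1j * k * w0 * x))) * L / N

indices = range(-3, 4)
gram = np.array([[inner(n, k) for k in indices] for n in indices])
print(np.round(gram.real, 10))
```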

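It is not difficult to derive an expression for the complex Fourier coefficients $\{c_n\}$. This can be accomplished by taking the $L^2_{per}[0, L]$ inner product of both sides of equation (3.35) with the function $e^{j k \omega_0 x}$:

$$
\int_0^L f(x)\, e^{-j k \omega_0 x}\, dx = \sum_{n \in \mathbb{Z}} c_n \int_0^L e^{j n \omega_0 x}\, e^{-j k \omega_0 x}\, dx
$$

Using equation (3.36), this can be reduced to

$$
\int_0^L f(x)\, e^{-j k \omega_0 x}\, dx = \sum_{n \in \mathbb{Z}} c_n\, L\, \delta_{k,n}
$$

Finally, we obtain

$$
c_n = \frac{1}{L} \int_0^L f(x)\, e^{-j n \omega_0 x}\, dx, \quad n \in \mathbb{Z} \tag{3.39}
$$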

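Note that the complex Fourier series extends over both the positive and negative integers ($n \in \mathbb{Z}$). For a real-valued function $f$, the coefficients $\{c_n\}$ and $\{c_{-n}\}$ form complex conjugate pairs since, using Euler’s identity:

$$
c_n = \frac{1}{L} \int_0^L f(x)\, e^{-j n \omega_0 x}\, dx = \frac{1}{L} \int_0^L f(x) \cos(n \omega_0 x)\, dx - j\, \frac{1}{L} \int_0^L f(x) \sin(n \omega_0 x)\, dx
$$

$$
c_{-n} = \frac{1}{L} \int_0^L f(x)\, e^{j n \omega_0 x}\, dx = \frac{1}{L} \int_0^L f(x) \cos(n \omega_0 x)\, dx + j\, \frac{1}{L} \int_0^L f(x) \sin(n \omega_0 x)\, dx
$$

Clearly, then, $c_{-n} = c_n^*$, where the asterisk denotes complex conjugation.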
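
The coefficient formula (3.39) and the conjugate symmetry of the coefficients can be illustrated with a real-valued trigonometric polynomial whose coefficients are known in closed form. The test function, the period, and the grid size in the Python sketch below are assumptions chosen for this example.

```python
import numpy as np

L = 2.0
w0 = 2.0 * np.pi / L
N = 2048
x = np.arange(N) * L / N      # uniform grid on [0, L)

def f(x):
    """Real-valued periodic test function with known Fourier coefficients."""
    return 1.0 + 3.0 * np.cos(w0 * x) + 0.5 * np.sin(2.0 * w0 * x)

def fourier_coefficient(n):
    """c_n = (1/L) * integral over [0, L] of f(x) exp(-j n w0 x), by the rectangle rule."""
    return np.sum(f(x) * np.exp(-1j * n * w0 * x)) / N

c = {n: fourier_coefficient(n) for n in range(-3, 4)}
for n in range(0, 4):
    print(n, np.round(c[n], 10), np.round(c[-n], 10))
```

For this function the nonzero coefficients are $c_0 = 1$, $c_{\pm 1} = 3/2$, and $c_{\pm 2} = \mp j/4$, so each pair $(c_n, c_{-n})$ printed above should be complex conjugates.
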

Using Euler’s identity once again, the complex Fourier series

$$
f(x) = \sum_{n \in \mathbb{Z}} c_n\, e^{j n \omega_0 x} \tag{3.40}
$$

can be expanded into the form

$$
\begin{aligned}
f(x) &= \sum_{n \in \mathbb{Z}} c_n \{\cos(n \omega_0 x) + j \sin(n \omega_0 x)\} \\
&= c_0 + \sum_{n=1}^{\infty} c_n \{\cos(n \omega_0 x) + j \sin(n \omega_0 x)\} + \sum_{n=-1}^{-\infty} c_n \{\cos(n \omega_0 x) + j \sin(n \omega_0 x)\}
\end{aligned} \tag{3.41}
$$

Using the fact that the coefficients form complex conjugate pairs, the sum over the negative integers can be rewritten as

$$
\begin{aligned}
\sum_{n=-1}^{-\infty} c_n \{\cos(n \omega_0 x) + j \sin(n \omega_0 x)\} &= \sum_{n=1}^{\infty} c_{-n} \{\cos(n \omega_0 x) - j \sin(n \omega_0 x)\} \\
&= \sum_{n=1}^{\infty} c_n^* \{\cos(n \omega_0 x) - j \sin(n \omega_0 x)\}
\end{aligned}
$$

Then, equation (3.41) takes the form

$$
f(x) = c_0 + \sum_{n=1}^{\infty} \{c_n + c_n^*\} \cos(n \omega_0 x) + \sum_{n=1}^{\infty} \{c_n - c_n^*\}\, j \sin(n \omega_0 x) \tag{3.42}
$$

Defining

$$
\begin{aligned}
a_0 &:= c_0 \\
a_n &:= c_n + c_n^* = 2\,\mathrm{Re}(c_n) \\
b_n &:= j\,(c_n - c_n^*) = -2\,\mathrm{Im}(c_n)
\end{aligned} \tag{3.43}
$$

where $\mathrm{Re}(c_n)$ and $\mathrm{Im}(c_n)$ denote the real and imaginary parts of the coefficient $c_n$, we obtain the real Fourier series:

$$
f(x) = a_0 + \sum_{n=1}^{\infty} a_n \cos(n \omega_0 x) + \sum_{n=1}^{\infty} b_n \sin(n \omega_0 x) \tag{3.44}
$$

Note that the coefficients $\{a_n\}$ and $\{b_n\}$ are all real-valued. The real form of the Fourier series is a commonly used alternative to the complex Fourier series. From equations (3.42) and (3.43), the real Fourier coefficients are given by

$$
a_0 = \frac{1}{L} \int_0^L f(x)\, dx \tag{3.45}
$$

$$
a_n = \frac{2}{L} \int_0^L f(x) \cos(n \omega_0 x)\, dx, \quad n \in \mathbb{Z}^+ \tag{3.46}
$$

$$
b_n = \frac{2}{L} \int_0^L f(x) \sin(n \omega_0 x)\, dx, \quad n \in \mathbb{Z}^+ \tag{3.47}
$$

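where $\mathbb{Z}^+$ denotes the positive integers. Note that the Fourier basis functions satisfy the following orthogonality properties:

$$
\begin{aligned}
&(1)\quad \int_0^L \cos(m \omega_0 x) \cos(n \omega_0 x)\, dx = \frac{L}{2}\, \delta_{m,n} \\
&(2)\quad \int_0^L \sin(m \omega_0 x) \sin(n \omega_0 x)\, dx = \frac{L}{2}\, \delta_{m,n} \\
&(3)\quad \int_0^L \cos(m \omega_0 x) \sin(n \omega_0 x)\, dx = 0 \quad \forall\, m, n \in \mathbb{Z}^+
\end{aligned} \tag{3.48}
$$

Therefore, the Fourier basis functions form an orthogonal basis for $L^2_{per}(\mathbb{R})$.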
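
The relationship between the real and complex coefficients can be checked numerically. The Python sketch below computes $a_n$ and $b_n$ from (3.46) and (3.47), compares them with $2\,\mathrm{Re}(c_n)$ and $-2\,\mathrm{Im}(c_n)$ for $c_n$ as defined in (3.39), and rebuilds the function from a truncated form of (3.44). The test function, the period, the grid size, and the truncation level `n_max` are assumptions made for this illustration.

```python
import numpy as np

L = 1.5
w0 = 2.0 * np.pi / L
N = 1024
x = np.arange(N) * L / N      # uniform grid on [0, L)

def f(x):
    """Smooth, real-valued, L-periodic test function (an arbitrary choice)."""
    return np.exp(np.cos(w0 * x)) + 0.3 * np.sin(3.0 * w0 * x)

def integrate(values):
    """Rectangle rule on the uniform periodic grid over [0, L)."""
    return np.sum(values) * L / N

n_max = 20
a0 = integrate(f(x)) / L
a = {n: 2.0 / L * integrate(f(x) * np.cos(n * w0 * x)) for n in range(1, n_max + 1)}
b = {n: 2.0 / L * integrate(f(x) * np.sin(n * w0 * x)) for n in range(1, n_max + 1)}
c = {n: integrate(f(x) * np.exp(-1j * n * w0 * x)) / L for n in range(1, n_max + 1)}

# Partial sum of the real Fourier series (3.44), truncated at n_max.
partial = a0 + sum(a[n] * np.cos(n * w0 * x) + b[n] * np.sin(n * w0 * x)
                   for n in range(1, n_max + 1))
max_error = np.max(np.abs(partial - f(x)))
print(max_error, b[3], -2.0 * c[3].imag)
```

Because the test function is smooth and periodic, its coefficients decay rapidly and a few terms already reproduce it closely; this is the favorable case for Fourier methods.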

Fourier methods provide useful information about the frequency content of functions or signals. One drawback, however, is that since the Fourier basis functions have global support (they are supported over the entire real line), they are not capable of providing spatial information about a function. That is, the Fourier transform gives the spectral content of a signal, but cannot give information about where features occur in the signal in the spatial (or time) domain. Therefore, Fourier methods are not always optimal for the analysis of time-varying functions whose spectral content changes with time. In addition, Fourier methods are not particularly efficient in representing signals with sharp features or discontinuities. This is because the basis functions have global support. So, in representing a square pulse, as an example, it takes many terms in the Fourier series to average the contributions of the basis functions to zero outside of the square. A distinguishing characteristic of a Fourier series approximation of a signal with sharp edges is an overshoot where these features occur. Therefore, Fourier methods provide a valuable tool for analyzing the frequency content of many signals, but also have limitations for a wide class of signals that occur in engineering practice.
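
The overshoot at a sharp edge, usually called the Gibbs phenomenon, can be seen in a short numerical experiment. The Python sketch below sums the Fourier series of a square wave that equals $+1$ on the first half period and $-1$ on the second, whose only nonzero coefficients are $b_n = 4/(n\pi)$ for odd $n$. The choice of a square wave rather than a single pulse, the period, and the truncation levels are assumptions made for this illustration.

```python
import numpy as np

L = 1.0
w0 = 2.0 * np.pi / L
x = np.linspace(0.0, L / 2.0, 20001)      # fine grid on the half period where f = +1

def square_wave_partial_sum(x, n_max):
    """Partial sum of the square-wave Fourier series using odd harmonics n <= n_max."""
    total = np.zeros_like(x)
    for n in range(1, n_max + 1, 2):
        total += 4.0 / (n * np.pi) * np.sin(n * w0 * x)
    return total

peaks = {n_max: square_wave_partial_sum(x, n_max).max() for n_max in (25, 101, 401)}
print(peaks)   # the peak stays well above 1 however many harmonics are kept
```

Adding harmonics narrows the region of overshoot but does not reduce its height, which remains close to 1.18 for a jump from $-1$ to $+1$.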

3.2.2  Finite  Element  Analysis 

In contrast to the Fourier transform, which gives the spectral content of a function
or signal, finite element methods (FEM) provide details about a function in the
spatial domain. In the finite element method, the spatial domain is discretized into
a number of elements, and the nodes of these elements represent degrees of freedom
in the system. A continuous function, such as the displacement of a beam in bending,
is therefore approximated in terms of a finite number of discrete degrees of freedom.
The governing equation of the system, typically a partial differential equation, is
discretized into a matrix equation, which is then solved for the nodal degrees of
freedom. The accuracy of such a solution depends on the number of elements used: more
elements allow for higher accuracy, but also require more computational effort to
generate a solution. The key to the finite element method is that the basis functions
have local support, meaning that they are supported over only a few elements. Thus,
the resulting matrix equation is sparse and banded in structure. Another important
feature of finite element basis functions is that they are interpolatory at the nodes:
at each node, one basis function has a value of one while the others have values of
zero. Hence, in the resulting approximation of the function, the coefficients are
the nodal values.

A finite element approximation of a one-dimensional function $f$ takes the form

$$f_h(x) = \sum_i d_i\, N_{h,i}(x) \tag{3.49}$$

where $h$ denotes the discretization level. Small values of $h$ denote finer meshes, while
larger values of $h$ correspond to coarser meshes. The $\{d_i\}$ are the expansion
coefficients, which usually represent physical values (such as nodal displacements) since
the basis functions are interpolatory at the nodes. The basis functions $\{N_{h,i}\}$ are
typically piecewise-polynomial in form. Classical finite element basis functions are
derived from Lagrange polynomials, which interpolate a series of equally-spaced nodes
on each element [100]. The number of nodes in an element equals the number of basis
functions over that element. For example, classical linear basis functions are
obtained by applying Lagrange interpolation to a two-node element. The resulting
elemental basis functions $\{N_1, N_2\}$ are given by


$$N_1(x) = 1 - \frac{x}{L} \tag{3.50}$$

$$N_2(x) = \frac{x}{L} \tag{3.51}$$

where $L$ is the elemental length. These functions are shown in Figure (3.2). When
several elements are assembled, we obtain the classical hat functions, which are used,
for example, for modeling the axial displacement of a rod. These linear functions have
approximation order one, meaning that they can exactly represent all first-order
(linear) polynomials. The classical finite element functions can be constructed to
have any desired approximation order by varying the number of nodes. For example,
applying Lagrange interpolation over a three-node element yields three quadratic
basis functions of approximation order two. It should also be noted that finite
element basis functions are not typically orthogonal. As an example, the functions
$N_1$ and $N_2$ in Figure (3.2) are not orthogonal over the element, since
$\int_0^L N_1(x)\,N_2(x)\,dx = L/6 \neq 0$.

Figure 3.2: Linear Finite Element Basis Functions
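
The properties of the linear basis functions in equations (3.50) and (3.51) can be checked
with a few lines of Python. This is a minimal sketch under stated assumptions: the element
length, the test function, and all function names are hypothetical, and the integral is
evaluated with a composite Simpson rule, which is exact for the quadratic integrand
$N_1 N_2$.

```python
def n1(x, length):
    """First linear basis function on a two-node element, equation (3.50)."""
    return 1.0 - x / length

def n2(x, length):
    """Second linear basis function on a two-node element, equation (3.51)."""
    return x / length

def interpolate(f, x, length):
    """Elemental approximation; the expansion coefficients are the nodal values."""
    d1, d2 = f(0.0), f(length)
    return d1 * n1(x, length) + d2 * n2(x, length)

def simpson(g, a, b, n=100):
    """Composite Simpson rule (n even); exact for polynomials up to degree 3."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

LENGTH = 2.0                       # hypothetical element length
line = lambda x: 3.0 * x + 2.0     # hypothetical linear test function

# Inner product of N1 and N2 over the element; analytically L/6, so not orthogonal.
overlap = simpson(lambda x: n1(x, LENGTH) * n2(x, LENGTH), 0.0, LENGTH)
# A linear function is reproduced exactly by the two-node element.
reproduced = interpolate(line, 0.7, LENGTH)
```

The nonzero value of `overlap` illustrates the lack of orthogonality, while `reproduced`
matches the test function, consistent with approximation order one.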

As  stated  before,  the  accuracy  of  the  finite  element  approximation  in  equation 
(3.49)  depends  on  the  mesh  parameter  h.  The  mesh  parameter  can  be  thought  of  as 
defining  a frequency  for  the  basis  functions,  similar  to  the  dilation  index  j for  wavelets. 
Finite  element  bases  generate  approximations  via  nested  spaces.  A fine-scale  space 
Vh  can  be  defined  as 


$$V_h := \operatorname{span}\{N_{h,k}\}_{k \in \mathbb{Z}} \tag{3.52}$$

The space $V_{2h}$ is spanned by basis functions that are supported over a mesh that is
twice as coarse as $h$. It is not difficult to show that

$$V_{2h} \subset V_h \tag{3.53}$$

In general, the nested spaces take the form

$$V_{2^n h} \subset V_{2^{n-1} h} \subset \cdots \subset V_h \tag{3.54}$$
In the limit, the spaces $\{V_h\}$ approach $\{0\}$ as the mesh size approaches infinity and
$L^2(\mathbb{R})$ as the mesh size approaches zero. The nested character of the finite element
spaces is similar in form to that of the spaces $\{V_j\}$ that are encountered in a
wavelet-based multiresolution analysis. The main difference is that the finite element basis
functions are not amenable to multiresolution analysis. This is due in large part to
the fact that, unlike wavelets and their associated scaling functions, there are no
convenient relationships between the functions that span the different spaces $\{V_h\}$. It
should be noted that multigrid methods are commonly used to generate finite element
solutions by iterating over different mesh sizes. This is not the same as multiresolution
analysis, however, which yields a multiscale representation in terms of localized
functions (wavelets) of differing frequencies. Therefore, in general, finite element
methods are not particularly useful for obtaining information about a function in the
frequency domain.

3.2.3  Wavelets 

In previous sections, it was demonstrated that Fourier methods provide valuable
information about the spectral content of a function or signal, but are unable to give
spatial information about where features occur. In contrast, finite element methods
can yield spatial information to arbitrary accuracy, but do not provide frequency
information. Wavelets are, in a sense, intermediate to these two common engineering
tools. It has been shown that wavelets and their associated multiresolution analyses
provide information about the frequency content of functions or signals. The multilevel
wavelet basis is similar to the Fourier basis in that it is composed of functions
of many different frequencies. However, while Fourier basis functions have global
support, wavelets are compactly supported. In this respect, they are similar to finite
element functions and can provide information in the spatial domain. The compact
support of the wavelets enables efficient representations of signals with sharp edges or
discontinuities. There is no need to average out the contributions of functions away
from the features, as is the case with Fourier series. For example, in approximating a
square pulse, all of the wavelet coefficients away from the pulse are zero, and only a
few localized functions contribute to the representation. Therefore, wavelets combine
some of the advantages of Fourier analysis and finite element methods.

Similar to the continuous Fourier transform, there is a continuous wavelet transform,
defined as

$$F(a,b) = \int_{\mathbb{R}} f(x)\,\psi\!\left(\frac{x-a}{b}\right) dx \tag{3.55}$$

The inverse wavelet transform is given by

$$f(x) = \int_{\mathbb{R}} \int_{\mathbb{R}} F(a,b)\,\psi\!\left(\frac{x-a}{b}\right) da\,db \tag{3.56}$$

The continuous wavelet transform projects a function onto a continuous series of
translates and dilates of a wavelet $\psi$. The index $a \in \mathbb{R}$ represents a continuous
translation index, while the index $b \in \mathbb{R}$ denotes a continuous frequency parameter.
In practice, the wavelet transform is calculated over a series of integer translates $k$
and discrete frequencies $2^j$, $j \in \mathbb{Z}$. This leads to the discrete wavelet transform,
which gives a multiscale representation of a function or signal. As discussed in Section 3.1,
this multilevel expansion takes the form


$$f_j(x) = \sum_{k} \alpha_{j_0,k}\,\phi_{j_0,k}(x) + \sum_{l=j_0}^{j-1} \sum_{k} \beta_{l,k}\,\psi_{l,k}(x) \tag{3.57}$$

where $f_j$ denotes a fine-scale approximation of the original function $f$ on resolution
level $j$. If the wavelets are orthonormal, the expansion coefficients can be calculated
as the projections of the function onto each wavelet:

$$\beta_{j,k} = \int_{\mathbb{R}} f(x)\,\psi_{j,k}(x)\,dx \tag{3.58}$$
It will be shown later that, in practice, the coefficients $\{\beta_{j,k}\}$ are not calculated
in this manner but rather from a recursive application of fast decomposition formulas. Note
the similarity in form between the Fourier series and the multilevel wavelet expansion
given in equation (3.57). If the wavelets are orthonormal, the expansion coefficients
can be calculated in the same manner as the Fourier coefficients.

It has already been noted that the nested spaces $\{V_j\}$ that form a multiresolution
analysis are similar to the nested finite element spaces $\{V_h\}$. A fundamental difference
is that, with wavelets, the two-scale relationships provide a convenient means of
relating the functions on different levels. This property renders wavelets amenable
to multiresolution analysis, while finite element functions are not. In comparing
wavelets and finite element functions, it is interesting to note that wavelets also
can be constructed to have arbitrary approximation order. While for classical finite
element functions approximation order is determined by the number of interpolated
nodes, for wavelets it is related to the concept of vanishing moments. The moments
of a wavelet are defined as

$$M_p := \int_{\mathbb{R}} x^p\,\psi(x)\,dx \tag{3.59}$$

A wavelet has $N$ vanishing moments if the moments $M_p$ are zero for $p = 0, 1, \ldots, N-1$.
It can be demonstrated that the approximation order of a wavelet is
equivalent to its number of vanishing moments [99]. If a wavelet has $N$ vanishing
moments, then it has approximation order $N$ and the wavelets can exactly reproduce
polynomials of order $N$. Note that the zeroth moment $M_0$ vanishes for all wavelets,
meaning that wavelets have an average value of zero over the real line.
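
As a concrete illustration of equation (3.59), the moments of the Haar wavelet can be
computed in closed form. The sketch below is an illustrative aside in Python: it assumes
the Haar wavelet equal to $+1$ on $[0, \tfrac{1}{2})$ and $-1$ on $[\tfrac{1}{2}, 1)$,
integrates $x^p$ exactly on each half using rational arithmetic, and uses a hypothetical
function name.

```python
from fractions import Fraction

def haar_moment(p):
    """Moment M_p of the Haar wavelet: integral of x^p over [0, 1/2) minus over [1/2, 1)."""
    left = Fraction(1, 2) ** (p + 1) / (p + 1)          # integral of x^p on [0, 1/2)
    right = (1 - Fraction(1, 2) ** (p + 1)) / (p + 1)   # integral of x^p on [1/2, 1)
    return left - right

moments = [haar_moment(p) for p in range(3)]  # M_0, M_1, M_2
```

Under this assumption only the zeroth moment vanishes ($M_0 = 0$, $M_1 = -\tfrac{1}{4}$),
which is consistent with the low approximation order of the Haar wavelet.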

In conclusion, wavelets possess some of the desirable properties of Fourier series
and finite element methods. An effort has been made in this section to clarify the
similarities and differences between these three engineering tools. It is important to
note that, while wavelets provide an alternative analysis tool for functions and signals
for which Fourier or finite element methods are not optimal, there are many cases
where wavelets are not necessarily the best choice. It is clear, however, that wavelets
fill a unique niche in the spectrum of approximation tools in engineering practice.

Figure 3.3: Haar Scaling Function and Wavelet


3.3  The  Haar  Wavelet 

Section  3.1  has  provided  a brief  introduction  to  wavelets  and  multiresolution 
analysis.  This  section  elaborates  upon  some  of  these  ideas,  focusing  on  the  Haar 
wavelet  family.  While  the  Haar  wavelet  is  seldom  used  in  practice  because  it  is  a 
discontinuous  function  with  low  approximation  order,  it  is  worth  studying  because 
the  concepts  presented  in  Section  3.1  are  easily  visualized  for  this  simple  function. 
This  is  not  usually  the  case  for  more  complex  wavelet  families.  Therefore,  the  Haar 
wavelet  serves  as  an  excellent  prototype  for  studying  wavelets.  The  discussion  can 
then  be  extended  to  orthonormal  wavelet  families  and,  more  generally,  orthonormal 
multiwavelets. 

Figure (3.3) shows the Haar wavelet and scaling function defined on the interval
$[0,1]$. The scaling function $\phi$ shown in Figure (3.3) is the generator from which the
wavelet $\psi$ is constructed. The Haar scaling function and wavelet are defined as

$$\phi(x) = \begin{cases} 1 & 0 \le x < 1 \\ 0 & \text{otherwise} \end{cases} \tag{3.60}$$

$$\psi(x) = \begin{cases} 1 & 0 \le x < \tfrac{1}{2} \\ -1 & \tfrac{1}{2} \le x < 1 \\ 0 & \text{otherwise} \end{cases} \tag{3.61}$$


The level on which the Haar scaling function and wavelet are defined is denoted as
level 0. Recall that the scaling functions and wavelets on different resolution levels
are related to the functions defined in equations (3.60) and (3.61) by the expressions

$$\phi_{j,k}(x) = 2^{j/2}\,\phi(2^j x - k) \tag{3.62}$$

$$\psi_{j,k}(x) = 2^{j/2}\,\psi(2^j x - k) \tag{3.63}$$

where $j \in \mathbb{Z}$ is the dilation index, $k \in \mathbb{Z}$ is the translation index, and
the $2^{j/2}$ term is a normalization factor. Thus, the functions in equations (3.62) and
(3.63) are scaled dilates and translates of the original scaling function and wavelet.
Note that, evaluating the functions and using the definitions of $\phi$ and $\psi$, we have

$$\phi_{j,k}(x) = \begin{cases} 2^{j/2} & 2^{-j}k \le x < 2^{-j}(k+1) \\ 0 & \text{otherwise} \end{cases} \tag{3.64}$$

$$\psi_{j,k}(x) = \begin{cases} 2^{j/2} & 2^{-j}k \le x < 2^{-j}\left(k+\tfrac{1}{2}\right) \\ -2^{j/2} & 2^{-j}\left(k+\tfrac{1}{2}\right) \le x < 2^{-j}(k+1) \\ 0 & \text{otherwise} \end{cases} \tag{3.65}$$

Several scaling functions and wavelets on different resolution levels are plotted in
Figure (3.4) over the interval $[0,1]$. For clarity, only every other translate is shown.
Note that the functions on the different levels have different amplitudes due to the
$2^{j/2}$ normalization factor.
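
The dilated and translated Haar functions are simple to evaluate directly. The following
Python sketch is illustrative only: the function names are assumptions, and inner products
over $[0,1]$ are approximated by a midpoint rule on a dyadic grid of 1024 cells, which is
exact up to rounding for Haar functions on levels $j \le 9$ because the integrands are
constant on each cell.

```python
def phi(x):
    """Haar scaling function, equation (3.60)."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0

def psi(x):
    """Haar wavelet, equation (3.61)."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0

def phi_jk(x, j, k):
    """Scaled dilate and translate of the scaling function, equation (3.62)."""
    return 2.0 ** (j / 2) * phi(2.0 ** j * x - k)

def psi_jk(x, j, k):
    """Scaled dilate and translate of the wavelet, equation (3.63)."""
    return 2.0 ** (j / 2) * psi(2.0 ** j * x - k)

def inner(f, g, n=1024):
    """Midpoint-rule inner product of f and g over [0, 1]."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) * g((i + 0.5) * h) for i in range(n))

# The 2^(j/2) factor gives unit norm; distinct functions are mutually orthogonal.
norm_sq = inner(lambda x: phi_jk(x, 3, 2), lambda x: phi_jk(x, 3, 2))
same_level = inner(lambda x: phi_jk(x, 3, 2), lambda x: phi_jk(x, 3, 3))
cross_level = inner(lambda x: psi_jk(x, 2, 1), lambda x: psi_jk(x, 4, 5))
phi_psi = inner(lambda x: phi_jk(x, 2, 1), lambda x: psi_jk(x, 2, 1))
```

The amplitude $2^{j/2}$ and the support width $2^{-j}$ combine to give a squared norm of
one on every level, which is why the normalization factor appears in the definitions.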

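The functions $\{\phi_{j,k}\}$ and $\{\psi_{j,k}\}$ satisfy the following orthogonality
relationships:

$$
\begin{aligned}
&(1)\quad \langle \phi_{j,k}, \phi_{j,m} \rangle = \int_{\mathbb{R}} \phi_{j,k}(x)\,\phi_{j,m}(x)\,dx = \delta_{k,m} \\
&(2)\quad \langle \psi_{j,k}, \psi_{l,m} \rangle = \int_{\mathbb{R}} \psi_{j,k}(x)\,\psi_{l,m}(x)\,dx = \delta_{(j,k),(l,m)} \\
&(3)\quad \langle \phi_{j,k}, \psi_{l,m} \rangle = \int_{\mathbb{R}} \phi_{j,k}(x)\,\psi_{l,m}(x)\,dx = 0 \quad \forall\, k, m \in \mathbb{Z},\ j \le l
\end{aligned}
\tag{3.66}
$$

The notation $\delta_{(j,k),(l,m)}$ is defined as

$$\delta_{(j,k),(l,m)} = \begin{cases} 1 & j = l \text{ and } k = m \\ 0 & \text{otherwise} \end{cases} \tag{3.67}$$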
The  first  orthogonality  property  in  equation  (3.66)  states  that  the  scaling  functions 
on  the  same  level  are  orthonormal.  By  inspection,  it  is  clear  that  these  functions  are 
orthogonal  since  no  two  scaling  functions  on  the  same  level  have  overlapping  support. 
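The orthonormality of the scaling functions can be verified by taking the $L^2(\mathbb{R})$ norm
of an arbitrary scaling function $\phi_{j,k}$:

$$\|\phi_{j,k}\|_{L^2(\mathbb{R})} = \langle \phi_{j,k}, \phi_{j,k} \rangle^{1/2}_{L^2(\mathbb{R})} \tag{3.68}$$

Making use of equation (3.64), the inner product $\langle \phi_{j,k}, \phi_{j,k} \rangle$ is given by

$$
\begin{aligned}
\langle \phi_{j,k}, \phi_{j,k} \rangle &= \int_{\mathbb{R}} \phi_{j,k}(x)\,\phi_{j,k}(x)\,dx \\
&= \int_{2^{-j}k}^{2^{-j}(k+1)} 2^{j}\,dx \\
&= 2^{j}\left[2^{-j}(k+1) - 2^{-j}k\right] \\
&= (k+1) - k \\
&= 1
\end{aligned}
$$

Therefore, it is clear that

$$\|\phi_{j,k}\|_{L^2(\mathbb{R})} = 1$$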

and the scaling functions $\{\phi_{j,k}\}_{k \in \mathbb{Z}}$ on a given level $j$ form an
orthonormal set. The second orthogonality property states that two distinct wavelets are
orthogonal not only if they are supported on the same level, but also if they are on
different levels. This can be seen by inspection, and the orthonormality of the wavelets
can be demonstrated in a similar manner to that used for the scaling functions. Finally,
the third orthogonality property states that the scaling functions are orthogonal to the
wavelets that are supported on the same level. In addition, a given scaling function is
orthogonal to any wavelet that is supported on a finer grid. This property can easily be
verified by inspection.

Figure 3.5: Haar Scaling Functions on Adjacent Resolution Levels

Recall from Section 3.1 that there is a relationship between the scaling functions
that are supported on one level and those supported on an adjacent resolution level.
Figure (3.5) shows the $k$th translate of a scaling function $\phi_{j,k}$ on an arbitrary level $j$.
Also shown are two scaling functions, $\phi_{j+1,2k}$ and $\phi_{j+1,2k+1}$, supported on level
$j+1$, a grid that is twice as fine as level $j$. The supports of the functions $\phi_{j+1,2k}$
and $\phi_{j+1,2k+1}$ overlap that of $\phi_{j,k}$. The translation indices of these functions are
$2k$ and $2k+1$ due to the fact that the grid is now twice as fine. Note that the scaling
functions on level $j$ have an amplitude of $2^{j/2}$ while those on level $j+1$ have an
amplitude of $2^{(j+1)/2}$. Therefore, the act of contracting a scaling function (or wavelet)
results in an increase in the amplitude of that function. From Figure (3.5), it can be seen
that the scaling function $\phi_{j,k}$ can be expressed as a weighted sum of the scaling
functions $\phi_{j+1,2k}$ and $\phi_{j+1,2k+1}$ as follows:

$$\phi_{j,k}(x) = \frac{1}{\sqrt{2}}\,\phi_{j+1,2k}(x) + \frac{1}{\sqrt{2}}\,\phi_{j+1,2k+1}(x) \tag{3.69}$$

The $\frac{1}{\sqrt{2}}$ factors in equation (3.69) take into account the difference in amplitude
between the scaling functions on level $j$ and those supported on level $j+1$.

In a similar manner, wavelets on level $j$ can be constructed from scaling functions
that are supported on level $j+1$. It is not difficult to see that the wavelet $\psi_{j,k}$ can
be formed (modulo a constant) by piecing together a positive and a negative scaling
function on level $j+1$. The function $\psi_{j,k}$ can be expressed as a linear combination
of the scaling functions $\phi_{j+1,2k}$ and $\phi_{j+1,2k+1}$ as follows:

$$\psi_{j,k}(x) = \frac{1}{\sqrt{2}}\,\phi_{j+1,2k}(x) - \frac{1}{\sqrt{2}}\,\phi_{j+1,2k+1}(x) \tag{3.70}$$

Together, equations (3.69) and (3.70) comprise the generalized two-scale equations
for the Haar scaling function and wavelet. These equations show that the scaling
functions and wavelets on level $j$ can be constructed as weighted combinations of
the scaling functions that are supported on the finer-resolution level $j+1$. As an
example, setting $j = 0$ and $k = 0$ in equations (3.69) and (3.70), we obtain

$$
\begin{aligned}
\phi(x) &= \frac{1}{\sqrt{2}}\,\phi_{1,0}(x) + \frac{1}{\sqrt{2}}\,\phi_{1,1}(x) \\
&= \phi(2x) + \phi(2x - 1)
\end{aligned}
\tag{3.71}
$$

$$
\begin{aligned}
\psi(x) &= \frac{1}{\sqrt{2}}\,\phi_{1,0}(x) - \frac{1}{\sqrt{2}}\,\phi_{1,1}(x) \\
&= \phi(2x) - \phi(2x - 1)
\end{aligned}
\tag{3.72}
$$
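
The two-scale equations (3.69) and (3.70) can be confirmed pointwise. The sketch below is
a hedged Python illustration: the function names are hypothetical, the wavelet is evaluated
from its piecewise definition (3.65), and the comparison is made at the midpoints of a
dyadic grid on $[0,1]$ so that no sample falls on a discontinuity.

```python
import math

def phi_jk(x, j, k):
    """Haar scaling function on level j, translate k, from equation (3.64)."""
    t = 2.0 ** j * x - k
    return 2.0 ** (j / 2) if 0.0 <= t < 1.0 else 0.0

def psi_jk(x, j, k):
    """Haar wavelet on level j, translate k, from equation (3.65)."""
    t = 2.0 ** j * x - k
    if 0.0 <= t < 0.5:
        return 2.0 ** (j / 2)
    if 0.5 <= t < 1.0:
        return -(2.0 ** (j / 2))
    return 0.0

def phi_two_scale(x, j, k):
    """Right-hand side of equation (3.69)."""
    return (phi_jk(x, j + 1, 2 * k) + phi_jk(x, j + 1, 2 * k + 1)) / math.sqrt(2.0)

def psi_two_scale(x, j, k):
    """Right-hand side of equation (3.70)."""
    return (phi_jk(x, j + 1, 2 * k) - phi_jk(x, j + 1, 2 * k + 1)) / math.sqrt(2.0)

def max_mismatch(j, k, n=512):
    """Largest pointwise difference between both sides of (3.69) and (3.70)."""
    xs = [(i + 0.5) / n for i in range(n)]
    err_phi = max(abs(phi_jk(x, j, k) - phi_two_scale(x, j, k)) for x in xs)
    err_psi = max(abs(psi_jk(x, j, k) - psi_two_scale(x, j, k)) for x in xs)
    return max(err_phi, err_psi)
```

For example, `max_mismatch(0, 0)` checks equations (3.71) and (3.72), and
`max_mismatch(3, 5)` checks a finer level; both differences are at the level of rounding
error.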

The Haar scaling function $\phi$ is the generator of a series of nested spaces
$\{V_j\}_{j \in \mathbb{Z}}$ that form a multiresolution analysis for $L^2(\mathbb{R})$. Each space
$V_j$ is defined as

$$V_j := \operatorname{span}\{\phi_{j,k}\}_{k \in \mathbb{Z}} \tag{3.73}$$

Recall that the space $V_j$ can be decomposed as

$$V_j = V_{j-1} \oplus W_{j-1} \tag{3.74}$$

where

$$V_{j-1} := \operatorname{span}\{\phi_{j-1,k}\}_{k \in \mathbb{Z}} \tag{3.75}$$

and $W_{j-1}$ is the orthogonal complement space, defined as

$$W_{j-1} := \operatorname{span}\{\psi_{j-1,k}\}_{k \in \mathbb{Z}} \tag{3.76}$$

Figure 3.6: Decomposition of $V_j$ into $V_{j-1}$ and $W_{j-1}$
The wavelet space $W_{j-1}$ represents the difference between the two adjacent spaces $V_j$
and $V_{j-1}$. To visualize this, consider the function $f$, shown in Figure (3.6), which has
been approximated in terms of the scaling functions $\{\phi_{j,k}\}_{k \in \mathbb{Z}}$ that span
the space $V_j$. This simply amounts to approximating $f$ using piecewise constants on a grid
with spacing $2^{-j}$. Consider the circled region that has been enlarged in the figure. The
function $f_j$ represents the approximation of the function $f$ in $V_j$, which can
be written as a sum of the functions $f_{j-1} \in V_{j-1}$ and $g_{j-1} \in W_{j-1}$. Note that
$f_{j-1}$ is the average of $f_j$ while $g_{j-1}$ represents the difference between $f_j$ and
$f_{j-1}$. Equation (3.74) can be applied recursively to obtain the following multiscale
decomposition of the space $V_j$:

$$V_j = V_{j_0} \oplus W_{j_0} \oplus W_{j_0+1} \oplus \cdots \oplus W_{j-2} \oplus W_{j-1} \tag{3.77}$$

where $j_0$ is the coarsest level used in the approximation.

Recall that equation (3.73) gives a single-scale approximation of a function
$f \in L^2(\mathbb{R})$. This representation takes the form

$$f_j(x) = \sum_{k \in \mathbb{Z}} \alpha_{j,k}\,\phi_{j,k}(x) \tag{3.78}$$

where $\{\alpha_{j,k}\}$ are constant scaling function coefficients and $j$ is usually chosen to
correspond to a relatively fine grid. For the Haar scaling function, $f_j$ is simply a
piecewise-constant approximation of $f$. Equation (3.74) states that an equivalent
two-scale expansion of $f_j$ can be written as

$$f_j(x) = \sum_{k \in \mathbb{Z}} \alpha_{j-1,k}\,\phi_{j-1,k}(x) + \sum_{k \in \mathbb{Z}} \beta_{j-1,k}\,\psi_{j-1,k}(x) \tag{3.79}$$

where $\{\alpha_{j-1,k}\}$ and $\{\beta_{j-1,k}\}$ represent constant scaling function and wavelet
coefficients. Finally, equation (3.77) gives a multiscale representation of $f_j$ of the form

$$f_j(x) = \sum_{k \in \mathbb{Z}} \alpha_{j_0,k}\,\phi_{j_0,k}(x) + \sum_{l=j_0}^{j-1} \sum_{k \in \mathbb{Z}} \beta_{l,k}\,\psi_{l,k}(x) \tag{3.80}$$

This multilevel expansion is in terms of the scaling functions on the coarsest level
$j_0$ and the wavelets that are supported on levels $j_0$ through $j-1$. Recall that this
representation  is  desirable  because  it  enables  both  spatial  and  frequency  resolution 
of  a function  or  signal.  In  addition,  it  is  often  the  case  that  a large  number  of  the 
multiscale  coefficients  are  close  to  zero  and  can  be  neglected.  This  leads  to  a relatively 
sparse  representation  of  the  function. 

As an example, the Haar wavelet transform can be used to decompose the function
depicted in Figure (3.7). This function is a 0.5 Hz sine wave over the domain $[0,1]$
with a sharp feature in the form of a 32 Hz sine component. Figure (3.8) shows the
Haar wavelet decomposition of this function. The original fine-scale representation
of the function is taken on level 8, which corresponds to an approximation in terms
of piecewise constants of width $2^{-8}$. The coarsest level taken in the decomposition is
level 0. Therefore, we have a multilevel wavelet representation of the function in terms
of the scaling function approximation on level 0, which corresponds to the average
value of the function over $[0,1]$, and the wavelet details on levels 0 through 7. Note
that, on the finer-resolution levels, the wavelet details are nonzero only where the
sharp features of the function occur. On the coarser scales, there are no details from
these sharp features and the wavelet details capture the low-frequency component
of the signal.

Figure 3.7: Original Function

Figure 3.8: Haar Wavelet Decomposition

In practice, the single-scale approximation is the starting point for a multiresolution
analysis. A fundamental issue, then, is how to calculate the coefficients that
appear in the multiscale representation in equation (3.80). In this discussion, it is
assumed that the single-scale coefficients $\{\alpha_{j,k}\}$ have been determined, perhaps using
one of the techniques discussed in Section 3.1. It is not difficult to derive decomposition
formulas from which the multiscale coefficients can be calculated from the
single-scale coefficients $\{\alpha_{j,k}\}$. In order to do this, we first note that the
single-scale and two-scale representations of a function $f$ are equivalent. From equations
(3.78) and (3.79), we have

$$\sum_{k \in \mathbb{Z}} \alpha_{j,k}\,\phi_{j,k}(x) = \sum_{k \in \mathbb{Z}} \alpha_{j-1,k}\,\phi_{j-1,k}(x) + \sum_{k \in \mathbb{Z}} \beta_{j-1,k}\,\psi_{j-1,k}(x) \tag{3.81}$$

Taking the $L^2(\mathbb{R})$ inner product of both sides of equation (3.81) with an
arbitrary scaling function $\phi_{j-1,m}$ on level $j-1$, we obtain

$$
\begin{aligned}
\sum_{k \in \mathbb{Z}} \alpha_{j,k} \int_{\mathbb{R}} \phi_{j,k}(x)\,\phi_{j-1,m}(x)\,dx
&= \sum_{k \in \mathbb{Z}} \alpha_{j-1,k} \int_{\mathbb{R}} \phi_{j-1,k}(x)\,\phi_{j-1,m}(x)\,dx \\
&\quad + \sum_{k \in \mathbb{Z}} \beta_{j-1,k} \int_{\mathbb{R}} \psi_{j-1,k}(x)\,\phi_{j-1,m}(x)\,dx
\end{aligned}
\tag{3.82}
$$

Recall that, from the orthogonality properties of the Haar scaling function and
wavelet, the integrals on the right-hand side of equation (3.82) can be simplified
as
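$$\int_{\mathbb{R}} \phi_{j-1,k}(x)\,\phi_{j-1,m}(x)\,dx = \delta_{k,m}$$

$$\int_{\mathbb{R}} \psi_{j-1,k}(x)\,\phi_{j-1,m}(x)\,dx = 0 \quad \forall\, k, m \in \mathbb{Z}$$

Then, equation (3.82) reduces to

$$\sum_{k} \alpha_{j-1,k}\,\delta_{k,m} = \sum_{k \in \mathbb{Z}} \alpha_{j,k} \int_{\mathbb{R}} \phi_{j,k}(x)\,\phi_{j-1,m}(x)\,dx$$

Substituting for $\phi_{j-1,m}$ using the two-scale relationship, equation (3.69), we obtain

$$\alpha_{j-1,m} = \sum_{k \in \mathbb{Z}} \alpha_{j,k} \int_{\mathbb{R}} \phi_{j,k}(x) \left[ \frac{1}{\sqrt{2}}\,\phi_{j,2m}(x) + \frac{1}{\sqrt{2}}\,\phi_{j,2m+1}(x) \right] dx$$

Using the orthogonality of the scaling functions once again, this can be written as

$$\alpha_{j-1,m} = \sum_{k} \alpha_{j,k}\,\frac{1}{\sqrt{2}} \left( \delta_{k,2m} + \delta_{k,2m+1} \right)$$

Finally, the following expression for $\alpha_{j-1,m}$ is obtained:

$$\alpha_{j-1,m} = \frac{1}{\sqrt{2}} \left( \alpha_{j,2m} + \alpha_{j,2m+1} \right) \tag{3.83}$$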

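In a similar manner, taking the inner product of both sides of equation (3.81) with
$\psi_{j-1,m}$, we have

$$
\begin{aligned}
\sum_{k \in \mathbb{Z}} \alpha_{j,k} \int_{\mathbb{R}} \phi_{j,k}(x)\,\psi_{j-1,m}(x)\,dx
&= \sum_{k \in \mathbb{Z}} \alpha_{j-1,k} \int_{\mathbb{R}} \phi_{j-1,k}(x)\,\psi_{j-1,m}(x)\,dx \\
&\quad + \sum_{k \in \mathbb{Z}} \beta_{j-1,k} \int_{\mathbb{R}} \psi_{j-1,k}(x)\,\psi_{j-1,m}(x)\,dx
\end{aligned}
$$

which, by orthogonality, reduces to

$$\beta_{j-1,m} = \sum_{k \in \mathbb{Z}} \alpha_{j,k} \int_{\mathbb{R}} \phi_{j,k}(x)\,\psi_{j-1,m}(x)\,dx$$

Then, substituting for $\psi_{j-1,m}$ using the two-scale equation, equation (3.70), we obtain

$$\beta_{j-1,m} = \sum_{k \in \mathbb{Z}} \alpha_{j,k} \int_{\mathbb{R}} \phi_{j,k}(x) \left[ \frac{1}{\sqrt{2}}\,\phi_{j,2m}(x) - \frac{1}{\sqrt{2}}\,\phi_{j,2m+1}(x) \right] dx$$

Once again making use of the orthogonality of the scaling functions, this can be
simplified to

$$\beta_{j-1,m} = \sum_{k} \alpha_{j,k}\,\frac{1}{\sqrt{2}} \left( \delta_{k,2m} - \delta_{k,2m+1} \right)$$

Finally, the following expression for $\beta_{j-1,m}$ is obtained:

$$\beta_{j-1,m} = \frac{1}{\sqrt{2}} \left( \alpha_{j,2m} - \alpha_{j,2m+1} \right) \tag{3.84}$$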
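
The decomposition formulas (3.83) and (3.84) can be cross-checked against a direct
projection. The Python sketch below is illustrative: the level-3 coefficients are arbitrary
values chosen for the check, the names are hypothetical, and the projections onto
$\phi_{2,m}$ and $\psi_{2,m}$ are evaluated with a midpoint rule on a dyadic grid, which is
exact up to rounding for these piecewise-constant integrands.

```python
import math

def phi_jk(x, j, k):
    """Haar scaling function on level j, translate k, from equation (3.64)."""
    t = 2.0 ** j * x - k
    return 2.0 ** (j / 2) if 0.0 <= t < 1.0 else 0.0

def psi_jk(x, j, k):
    """Haar wavelet on level j, translate k, from equation (3.65)."""
    t = 2.0 ** j * x - k
    if 0.0 <= t < 0.5:
        return 2.0 ** (j / 2)
    if 0.5 <= t < 1.0:
        return -(2.0 ** (j / 2))
    return 0.0

def project(f, basis, j, k, n=1024):
    """Midpoint-rule inner product of f with basis(., j, k) over [0, 1]."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) * basis((i + 0.5) * h, j, k) for i in range(n))

J = 3
alpha_j = [4.0, 2.0, 5.0, 5.0, -1.0, 3.0, 0.0, 2.0]  # arbitrary level-3 coefficients

def f_j(x):
    """Single-scale expansion (3.78) restricted to [0, 1]."""
    return sum(a * phi_jk(x, J, k) for k, a in enumerate(alpha_j))

s = 1.0 / math.sqrt(2.0)
half = len(alpha_j) // 2
alpha_formula = [s * (alpha_j[2 * m] + alpha_j[2 * m + 1]) for m in range(half)]  # (3.83)
beta_formula = [s * (alpha_j[2 * m] - alpha_j[2 * m + 1]) for m in range(half)]   # (3.84)
alpha_direct = [project(f_j, phi_jk, J - 1, m) for m in range(half)]
beta_direct = [project(f_j, psi_jk, J - 1, m) for m in range(half)]
```

The two routes agree to rounding error, but the formulas require only one addition and one
multiplication per coefficient, whereas the direct route requires an integral for each one.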

Equations (3.83) and (3.84) are decomposition formulas whereby the coefficients
$\{\alpha_{j-1,k}\}$ and $\{\beta_{j-1,k}\}$ can be calculated from the single-scale coefficients
$\{\alpha_{j,k}\}$ in a fast, efficient manner. Clearly, the coefficients $\{\alpha_{j-2,k}\}$ and
$\{\beta_{j-2,k}\}$ can then be determined from $\{\alpha_{j-1,k}\}$. Continuing in this manner, the
entire array of multilevel coefficients (the discrete wavelet transform) can be calculated
through a recursive application of the decomposition formulas. Hence, generating the
coefficients in the multiscale representation of a function is a recursive, fast-filter
operation implemented over all the levels that are included in the decomposition. It is
convenient to express the results in equations (3.83) and (3.84) in matrix form:
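$$
\begin{Bmatrix} \vdots \\ \alpha_{j-1,m} \\ \alpha_{j-1,m+1} \\ \alpha_{j-1,m+2} \\ \vdots \end{Bmatrix}
=
\begin{bmatrix}
\ddots & & & & & & & \\
& \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 & 0 & 0 & 0 & \\
& 0 & 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 & 0 & \\
& 0 & 0 & 0 & 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & \\
& & & & & & & \ddots
\end{bmatrix}
\begin{Bmatrix} \vdots \\ \alpha_{j,2m} \\ \alpha_{j,2m+1} \\ \alpha_{j,2m+2} \\ \alpha_{j,2m+3} \\ \alpha_{j,2m+4} \\ \alpha_{j,2m+5} \\ \vdots \end{Bmatrix}
\tag{3.85}
$$

$$
\begin{Bmatrix} \vdots \\ \beta_{j-1,m} \\ \beta_{j-1,m+1} \\ \beta_{j-1,m+2} \\ \vdots \end{Bmatrix}
=
\begin{bmatrix}
\ddots & & & & & & & \\
& \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 & 0 & 0 & 0 & \\
& 0 & 0 & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & 0 & 0 & \\
& 0 & 0 & 0 & 0 & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} & \\
& & & & & & & \ddots
\end{bmatrix}
\begin{Bmatrix} \vdots \\ \alpha_{j,2m} \\ \alpha_{j,2m+1} \\ \alpha_{j,2m+2} \\ \alpha_{j,2m+3} \\ \alpha_{j,2m+4} \\ \alpha_{j,2m+5} \\ \vdots \end{Bmatrix}
\tag{3.86}
$$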

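The block diagonal matrices that appear in equations (3.85) and (3.86) are referred
to as decomposition filter matrices. For notational simplicity, let the decomposition
filter matrices in equations (3.85) and (3.86) be denoted as $[A_{j-1,j}]$ and $[B_{j-1,j}]$,
respectively, where the subscripts $j-1$ and $j$ indicate between which two levels the
decomposition is taking place. Introducing the vector notation

$$
\boldsymbol{\alpha}_j := \begin{Bmatrix} \vdots \\ \alpha_{j,2m} \\ \alpha_{j,2m+1} \\ \alpha_{j,2m+2} \\ \vdots \end{Bmatrix}, \qquad
\boldsymbol{\alpha}_{j-1} := \begin{Bmatrix} \vdots \\ \alpha_{j-1,m} \\ \alpha_{j-1,m+1} \\ \alpha_{j-1,m+2} \\ \vdots \end{Bmatrix}, \qquad
\boldsymbol{\beta}_{j-1} := \begin{Bmatrix} \vdots \\ \beta_{j-1,m} \\ \beta_{j-1,m+1} \\ \beta_{j-1,m+2} \\ \vdots \end{Bmatrix}
$$

equations (3.85) and (3.86) can be written in the consolidated form

$$\boldsymbol{\alpha}_{j-1} = [A_{j-1,j}]\,\boldsymbol{\alpha}_j \tag{3.87}$$

$$\boldsymbol{\beta}_{j-1} = [B_{j-1,j}]\,\boldsymbol{\alpha}_j \tag{3.88}$$

The vector $\boldsymbol{\alpha}_j$ represents the scaling function coefficients on level $j$, and
$\boldsymbol{\alpha}_{j-1}$ and $\boldsymbol{\beta}_{j-1}$ are vectors composed of the scaling function
and wavelet coefficients on level $j-1$. Note that, since level $j-1$ is twice as coarse as
level $j$, the vectors $\boldsymbol{\alpha}_{j-1}$ and $\boldsymbol{\beta}_{j-1}$ each contain half as
many elements as $\boldsymbol{\alpha}_j$. Because of the block diagonal form of $[A_{j-1,j}]$ and
$[B_{j-1,j}]$, wavelet decomposition is computationally very efficient. In the next section,
it will be shown that the decomposition filter matrices for all orthonormal wavelet
families take the same general form as those for the Haar wavelet.

The inverse of the decomposition operation is known as reconstruction. It is not
difficult to derive a reconstruction formula whereby the finer-scale coefficients
$\{\alpha_{j,k}\}$ can be calculated from the coarser-scale coefficients $\{\alpha_{j-1,k}\}$ and
$\{\beta_{j-1,k}\}$. This is accomplished by first taking the inner product of both sides of
equation (3.81) with the scaling function $\phi_{j,m}$:

$$
\begin{aligned}
\sum_{k \in \mathbb{Z}} \alpha_{j,k} \int_{\mathbb{R}} \phi_{j,k}(x)\,\phi_{j,m}(x)\,dx
&= \sum_{k \in \mathbb{Z}} \alpha_{j-1,k} \int_{\mathbb{R}} \phi_{j-1,k}(x)\,\phi_{j,m}(x)\,dx \\
&\quad + \sum_{k \in \mathbb{Z}} \beta_{j-1,k} \int_{\mathbb{R}} \psi_{j-1,k}(x)\,\phi_{j,m}(x)\,dx
\end{aligned}
$$

Substituting for $\phi_{j-1,k}$ and $\psi_{j-1,k}$ using the two-scale equations, we obtain

$$
\begin{aligned}
\sum_{k \in \mathbb{Z}} \alpha_{j,k} \int_{\mathbb{R}} \phi_{j,k}(x)\,\phi_{j,m}(x)\,dx
&= \sum_{k \in \mathbb{Z}} \alpha_{j-1,k} \int_{\mathbb{R}} \left[ \frac{1}{\sqrt{2}}\,\phi_{j,2k}(x) + \frac{1}{\sqrt{2}}\,\phi_{j,2k+1}(x) \right] \phi_{j,m}(x)\,dx \\
&\quad + \sum_{k \in \mathbb{Z}} \beta_{j-1,k} \int_{\mathbb{R}} \left[ \frac{1}{\sqrt{2}}\,\phi_{j,2k}(x) - \frac{1}{\sqrt{2}}\,\phi_{j,2k+1}(x) \right] \phi_{j,m}(x)\,dx
\end{aligned}
$$

By orthogonality, this reduces to

$$
\begin{aligned}
\alpha_{j,m}
&= \sum_{k \in \mathbb{Z}} \alpha_{j-1,k} \int_{\mathbb{R}} \left[ \frac{1}{\sqrt{2}}\,\phi_{j,2k}(x) + \frac{1}{\sqrt{2}}\,\phi_{j,2k+1}(x) \right] \phi_{j,m}(x)\,dx \\
&\quad + \sum_{k \in \mathbb{Z}} \beta_{j-1,k} \int_{\mathbb{R}} \left[ \frac{1}{\sqrt{2}}\,\phi_{j,2k}(x) - \frac{1}{\sqrt{2}}\,\phi_{j,2k+1}(x) \right] \phi_{j,m}(x)\,dx
\end{aligned}
$$

Finally, by the orthogonality of the scaling functions, the following reconstruction
formula is obtained:

$$\alpha_{j,m} = \sum_{k} \alpha_{j-1,k} \left( \frac{1}{\sqrt{2}}\,\delta_{2k,m} + \frac{1}{\sqrt{2}}\,\delta_{2k+1,m} \right) + \sum_{k} \beta_{j-1,k} \left( \frac{1}{\sqrt{2}}\,\delta_{2k,m} - \frac{1}{\sqrt{2}}\,\delta_{2k+1,m} \right) \tag{3.89}$$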

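It is convenient to write the preceding reconstruction formula in matrix form:

$$
\begin{Bmatrix} \vdots \\ \alpha_{j,2k} \\ \alpha_{j,2k+1} \\ \alpha_{j,2k+2} \\ \alpha_{j,2k+3} \\ \vdots \end{Bmatrix}
=
\begin{bmatrix}
\ddots & & & \\
& \frac{1}{\sqrt{2}} & 0 & \\
& \frac{1}{\sqrt{2}} & 0 & \\
& 0 & \frac{1}{\sqrt{2}} & \\
& 0 & \frac{1}{\sqrt{2}} & \\
& & & \ddots
\end{bmatrix}
\begin{Bmatrix} \vdots \\ \alpha_{j-1,k} \\ \alpha_{j-1,k+1} \\ \vdots \end{Bmatrix}
+
\begin{bmatrix}
\ddots & & & \\
& \frac{1}{\sqrt{2}} & 0 & \\
& -\frac{1}{\sqrt{2}} & 0 & \\
& 0 & \frac{1}{\sqrt{2}} & \\
& 0 & -\frac{1}{\sqrt{2}} & \\
& & & \ddots
\end{bmatrix}
\begin{Bmatrix} \vdots \\ \beta_{j-1,k} \\ \beta_{j-1,k+1} \\ \vdots \end{Bmatrix}
\tag{3.90}
$$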

Using our vector notation and noting that the matrices that premultiply the vectors
$\boldsymbol{\alpha}_{j-1}$ and $\boldsymbol{\beta}_{j-1}$ are the transposes of $[A_{j-1,j}]$ and
$[B_{j-1,j}]$, respectively, equation (3.90) can be written as

$$\boldsymbol{\alpha}_j = [A_{j-1,j}]^T\,\boldsymbol{\alpha}_{j-1} + [B_{j-1,j}]^T\,\boldsymbol{\beta}_{j-1} \tag{3.91}$$

The decomposition filter matrices $[A_{j-1,j}]$ and $[B_{j-1,j}]$ are orthogonal transformations,
meaning that they satisfy

$$[A_{j-1,j}]\,[A_{j-1,j}]^T = [I] \tag{3.92}$$

$$[B_{j-1,j}]\,[B_{j-1,j}]^T = [I] \tag{3.93}$$

where $[I]$ denotes the identity matrix.
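
The matrix relationships above can be verified on a small example. The following Python
sketch is illustrative only: it builds finite $4 \times 8$ Haar filter matrices, uses
arbitrary fine-scale coefficients, and relies on hypothetical helper functions written with
plain lists rather than any particular linear algebra library.

```python
import math

def filter_matrices(n_fine):
    """Build the Haar filter matrices [A] and [B], each of size (n_fine/2) x n_fine."""
    s = 1.0 / math.sqrt(2.0)
    half = n_fine // 2
    a = [[0.0] * n_fine for _ in range(half)]
    b = [[0.0] * n_fine for _ in range(half)]
    for m in range(half):
        a[m][2 * m], a[m][2 * m + 1] = s, s     # row m of (3.85)
        b[m][2 * m], b[m][2 * m + 1] = s, -s    # row m of (3.86)
    return a, b

def transpose(mat):
    return [list(col) for col in zip(*mat)]

def matvec(mat, vec):
    return [sum(r * v for r, v in zip(row, vec)) for row in mat]

def matmul(p, q):
    q_t = transpose(q)
    return [[sum(x * y for x, y in zip(row, col)) for col in q_t] for row in p]

A, B = filter_matrices(8)
alpha_j = [4.0, 2.0, 5.0, 5.0, -1.0, 3.0, 0.0, 2.0]  # arbitrary fine-scale coefficients

alpha_coarse = matvec(A, alpha_j)   # decomposition, equation (3.87)
beta_coarse = matvec(B, alpha_j)    # decomposition, equation (3.88)

# Reconstruction, equation (3.91): transposed filters applied to the coarse vectors.
rebuilt = [
    u + v
    for u, v in zip(matvec(transpose(A), alpha_coarse), matvec(transpose(B), beta_coarse))
]

a_at = matmul(A, transpose(A))  # identity, equation (3.92)
b_bt = matmul(B, transpose(B))  # identity, equation (3.93)
a_bt = matmul(A, transpose(B))  # zero: scaling and wavelet filters are orthogonal
```

In this example `rebuilt` recovers `alpha_j`, so decomposition followed by reconstruction
is lossless. In practice the matrices would not be formed explicitly, since each row has
only two nonzero entries.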

Collectively,  the  decomposition  and  reconstruction  formulas  given  by  equations 
(3.83),  (3.84),  and  (3.89)  provide  a fast,  efficient  means  of  calculating  the  scaling 
function  and  wavelet  coefficients  on  one  level  from  the  scaling  function  coefficients  on 
a level  twice  as  fine,  and  vice  versa.  These  equations  are  applied  recursively  to  relate 
coefficients  from  nonadjacent  levels,  such  as  calculating  the  multiscale  coefficients 
from  the  single-scale  coefficients.  While  the  results  given  in  this  section  have  been 
developed  specifically  for  the  Haar  wavelet,  it  will  be  shown  in  the  following  section 
that  they  generalize  to  all  orthonormal  wavelet  families. 

3.4  Orthonormal  Wavelets 

In the preceding section, the two-scale equations were developed for the Haar
scaling function and wavelet as a means of relating the scaling functions and wavelets
on one level with those on an adjacent level. The multiresolution analysis that is
generated by the Haar scaling function was discussed in detail. In particular,
decomposition and reconstruction formulas were derived in order to relate the scaling
function and wavelet coefficients on different levels. While derived specifically for the
Haar wavelet family, these results can be generalized to all orthonormal wavelet
families. That is, the foregoing results are applicable to all scaling function and wavelet
pairs $(\phi, \psi)$ satisfying the following orthonormality conditions:
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(1) 

(&j,ki  $j,m) 

^ $j,k(1X)4>j,Tn{.x')dx  — 

R 

(2) 

Tpl, m) 

= J lpj,k(X)lPl,m{x)dx  = 

(3.94) 


R 

(3)  = j 4>jlk(^)'tPl,m{x)dx  = 0 Vfc.mGZ,  j<; 

R 

Clearly,  the  Haar  wavelet  is  a specific  example  of  an  orthonormal  wavelet. 

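Recall that the two-scale equations for the Haar scaling function and wavelet are
given by

$$\phi_{j,k}(x) = \frac{1}{\sqrt{2}}\,\phi_{j+1,2k}(x) + \frac{1}{\sqrt{2}}\,\phi_{j+1,2k+1}(x) \tag{3.95}$$

$$\psi_{j,k}(x) = \frac{1}{\sqrt{2}}\,\phi_{j+1,2k}(x) - \frac{1}{\sqrt{2}}\,\phi_{j+1,2k+1}(x) \tag{3.96}$$

The two $\frac{1}{\sqrt{2}}$ factors that appear in equation (3.95) and the $\frac{1}{\sqrt{2}}$ and $-\frac{1}{\sqrt{2}}$ factors that
appear in equation (3.96) are termed scaling function and wavelet filters, respectively.
If we define

$$\{a_s\} = \{a_0, a_1\} = \left\{ \tfrac{1}{\sqrt{2}},\ \tfrac{1}{\sqrt{2}} \right\}$$

$$\{b_s\} = \{b_0, b_1\} = \left\{ \tfrac{1}{\sqrt{2}},\ -\tfrac{1}{\sqrt{2}} \right\}$$

where $\{a_s\}$ represents the scaling function filters and $\{b_s\}$ represents the wavelet
filters, equations (3.95) and (3.96) can be written in the form

$$\phi_{j,k}(x) = \sum_s a_s\,\phi_{j+1,2k+s}(x) \tag{3.97}$$

$$\psi_{j,k}(x) = \sum_s b_s\,\phi_{j+1,2k+s}(x) \tag{3.98}$$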
In general, every scaling function and wavelet pair has a unique set of filters $\{a_s\}$ and
$\{b_s\}$. The number and range of filter coefficients in $\{a_s\}$ and $\{b_s\}$ differ for each pair
and depend on the support of the functions $\phi$ and $\psi$. The two-scale relationships for
all orthonormal wavelet families take the form shown in equations (3.97) and (3.98).
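Substituting equations (3.97) and (3.98) into the orthonormality conditions (3.94) implies discrete conditions on the filters themselves: $\sum_s a_s a_{s+2n} = \delta_{n,0}$, $\sum_s b_s b_{s+2n} = \delta_{n,0}$, and $\sum_s a_s b_{s+2n} = 0$. The sketch below checks these for the Haar filters and, as an assumed example of a longer filter set that is not taken from the text, the four-tap Daubechies filters with the common sign convention $b_s = (-1)^s a_{3-s}$. The helper names are hypothetical.

```python
import numpy as np

# Haar filters, as defined in the text.
haar_a = np.array([1.0, 1.0]) / np.sqrt(2.0)
haar_b = np.array([1.0, -1.0]) / np.sqrt(2.0)

# Four-tap Daubechies filters (assumed example, normalized so that sum(a) = sqrt(2)).
s3 = np.sqrt(3.0)
db_a = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4.0 * np.sqrt(2.0))
# Assumed convention for the wavelet filters: b_s = (-1)^s a_{3-s}.
db_b = np.array([db_a[3], -db_a[2], db_a[1], -db_a[0]])

def shifted_dot(u, v, n):
    """Return sum_s u[s] * v[s + 2n], treating out-of-range entries as zero."""
    total = 0.0
    for s in range(len(u)):
        t = s + 2 * n
        if 0 <= t < len(v):
            total += u[s] * v[t]
    return total

def is_orthonormal_pair(a, b, tol=1e-12):
    """Check the discrete orthonormality conditions on a filter pair."""
    max_shift = len(a) // 2
    for n in range(-max_shift, max_shift + 1):
        target = 1.0 if n == 0 else 0.0
        if abs(shifted_dot(a, a, n) - target) > tol:
            return False
        if abs(shifted_dot(b, b, n) - target) > tol:
            return False
        if abs(shifted_dot(a, b, n)) > tol:
            return False
    return True

haar_ok = is_orthonormal_pair(haar_a, haar_b)
db_ok = is_orthonormal_pair(db_a, db_b)
```

Both filter pairs satisfy the conditions in this sketch, which illustrates that only the number and values of the coefficients change from one orthonormal family to another.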

Multiresolution analyses are generated from orthonormal scaling functions in
exactly the same manner as the Haar scaling function. For a given orthonormal
scaling function and wavelet pair $(\phi, \psi)$, each space $V_j$ is defined as

$$V_j := \operatorname{span}\{\phi_{j,k}\}_{k \in \mathbb{Z}} \tag{3.99}$$

where

$$\phi_{j,k}(x) = 2^{j/2}\,\phi(2^j x - k) \tag{3.100}$$

Each wavelet complement space $W_j$ is defined as

$$W_j := \operatorname{span}\{\psi_{j,k}\}_{k \in \mathbb{Z}} \tag{3.101}$$

where

$$\psi_{j,k}(x) = 2^{j/2}\,\psi(2^j x - k) \tag{3.102}$$

Recall that the space $V_j$ can be decomposed in terms of the coarser-scale spaces $V_{j-1}$
and $W_{j-1}$:

$$V_j = V_{j-1} \oplus W_{j-1} \tag{3.103}$$

Applying equation (3.103) recursively, the following multilevel decomposition of $V_j$ is
obtained:

$$V_j = V_{j_0} \oplus W_{j_0} \oplus W_{j_0+1} \oplus \cdots \oplus W_{j-2} \oplus W_{j-1} \tag{3.104}$$
Recall that the starting point of a multiresolution analysis of a function or signal $f$
is a single-scale approximation of the form

$$f_j(x) = \sum_{k \in \mathbb{Z}} \alpha_{j,k}\,\phi_{j,k}(x) \tag{3.105}$$

which is in terms of the scaling functions that span a fine-scale space $V_j$. Equation
(3.103) leads to an equivalent, two-scale expansion:

$$f_j(x) = \sum_{k \in \mathbb{Z}} \alpha_{j-1,k}\,\phi_{j-1,k}(x) + \sum_{k \in \mathbb{Z}} \beta_{j-1,k}\,\psi_{j-1,k}(x) \tag{3.106}$$

Finally, equation (3.104) implies that we have a multilevel representation of $f_j$ of the
form

$$f_j(x) = \sum_{k \in \mathbb{Z}} \alpha_{j_0,k}\,\phi_{j_0,k}(x) + \sum_{l=j_0}^{j-1} \sum_{k \in \mathbb{Z}} \beta_{l,k}\,\psi_{l,k}(x) \tag{3.107}$$

which is in terms of the wavelets supported on levels $j_0$ through $j-1$ and the scaling
functions on only the coarsest level $j_0$.

Recall  that,  in  the  previous  section,  decomposition  formulas  were  developed  for 
the  Haar  wavelet.  Applied  recursively,  these  equations  provide  a fast,  efficient  tool  for 
calculating  the  coefficients  that  appear  in  the  multilevel  representation  of  a function. 
A reconstruction  formula,  which  performs  the  inverse  operation  to  decomposition, 
was  also  derived.  The  derivation  of  generalized  decomposition  and  reconstruction 
formulas  for  orthonormal  wavelet  families  follows  very  closely  from  the  development 
for the Haar wavelet. In order to derive these formulas for a given orthonormal
scaling function and wavelet pair $(\phi, \psi)$, we first equate the single-scale and two-scale
representations of a function $f$. From equations (3.105) and (3.106), we have

$$\sum_{k \in \mathbb{Z}} \alpha_{j,k}\,\phi_{j,k}(x) = \sum_{k \in \mathbb{Z}} \alpha_{j-1,k}\,\phi_{j-1,k}(x) + \sum_{k \in \mathbb{Z}} \beta_{j-1,k}\,\psi_{j-1,k}(x) \tag{3.108}$$
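A decomposition formula for the coefficients $\{\alpha_{j-1,k}\}$ can be obtained by taking the
$L^2(\mathbb{R})$ inner product of both sides of equation (3.108) with the scaling function $\phi_{j-1,m}$:

$$\sum_k \alpha_{j,k} \int_{\mathbb{R}} \phi_{j,k}(x)\,\phi_{j-1,m}(x)\,dx = \sum_k \alpha_{j-1,k} \int_{\mathbb{R}} \phi_{j-1,k}(x)\,\phi_{j-1,m}(x)\,dx + \sum_k \beta_{j-1,k} \int_{\mathbb{R}} \psi_{j-1,k}(x)\,\phi_{j-1,m}(x)\,dx$$

Using the orthogonality of the scaling functions and wavelets, this reduces to

$$\alpha_{j-1,m} = \sum_{k \in \mathbb{Z}} \alpha_{j,k} \int_{\mathbb{R}} \phi_{j,k}(x)\,\phi_{j-1,m}(x)\,dx$$

Substituting for $\phi_{j-1,m}$ using the two-scale equation, equation (3.97), we have

$$\alpha_{j-1,m} = \sum_k \alpha_{j,k} \int_{\mathbb{R}} \sum_s a_s\,\phi_{j,2m+s}(x)\,\phi_{j,k}(x)\,dx$$

By orthogonality, this can be simplified to

$$\alpha_{j-1,m} = \sum_{k,s} \alpha_{j,k}\,a_s\,\delta_{k,2m+s}$$

Finally, the following decomposition formula for $\alpha_{j-1,m}$ is obtained:

$$\alpha_{j-1,m} = \sum_k \alpha_{j,k}\,a_{k-2m} \tag{3.109}$$

Similarly, a decomposition formula for $\beta_{j-1,m}$ is given by

$$\beta_{j-1,m} = \sum_k \alpha_{j,k}\,b_{k-2m} \tag{3.110}$$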

Equation (3.110) can be derived by taking the inner product of both sides of equation
(3.108) with an arbitrary wavelet $\psi_{j-1,m}$ on level $j-1$. The two-scale relationship
in equation (3.98) and the orthogonality properties are then used to simplify the
resulting expression to the form shown in equation (3.110).
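The decomposition formulas (3.109) and (3.110) translate directly into a short routine. The sketch below is illustrative only: it assumes a finite coefficient vector of even length and treats the index $k$ periodically (an assumption made here for convenience; the text works on all of $\mathbb{Z}$), and the function name `decompose` is hypothetical.

```python
import numpy as np

def decompose(alpha_j, a, b):
    """One decomposition step: alpha_{j-1,m} = sum_k a_{k-2m} alpha_{j,k},
    beta_{j-1,m} = sum_k b_{k-2m} alpha_{j,k}, with k wrapped periodically."""
    alpha_j = np.asarray(alpha_j, dtype=float)
    n = alpha_j.size
    if n % 2:
        raise ValueError("the number of fine-scale coefficients must be even")
    half = n // 2
    alpha_c = np.zeros(half)
    beta_c = np.zeros(half)
    for m in range(half):
        for s in range(len(a)):
            k = (2 * m + s) % n          # so that k - 2m = s
            alpha_c[m] += a[s] * alpha_j[k]
            beta_c[m] += b[s] * alpha_j[k]
    return alpha_c, beta_c

# Haar filters from the text.
haar_a = np.array([1.0, 1.0]) / np.sqrt(2.0)
haar_b = np.array([1.0, -1.0]) / np.sqrt(2.0)

alpha_j = np.array([4.0, 2.0, 5.0, 5.0])
alpha_c, beta_c = decompose(alpha_j, haar_a, haar_b)
```

For this input the scaling coefficients are $(3\sqrt{2},\ 5\sqrt{2})$ and the wavelet coefficients are $(\sqrt{2},\ 0)$; the second wavelet coefficient vanishes because the signal is locally constant there.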

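It is convenient to express the results from equations (3.109) and (3.110) in
matrix form:

$$
\begin{Bmatrix} \alpha_{j-1,k} \\ \alpha_{j-1,k+1} \\ \alpha_{j-1,k+2} \end{Bmatrix}
=
\begin{bmatrix}
a_0 & a_1 & a_2 & a_3 & 0 & 0 & \cdots \\
0 & 0 & a_0 & a_1 & a_2 & a_3 & \cdots \\
0 & 0 & 0 & 0 & a_0 & a_1 & \cdots
\end{bmatrix}
\begin{Bmatrix} \alpha_{j,2k} \\ \alpha_{j,2k+1} \\ \alpha_{j,2k+2} \\ \vdots \end{Bmatrix}
\tag{3.111}
$$

$$
\begin{Bmatrix} \beta_{j-1,k} \\ \beta_{j-1,k+1} \\ \beta_{j-1,k+2} \end{Bmatrix}
=
\begin{bmatrix}
b_0 & b_1 & b_2 & b_3 & 0 & 0 & \cdots \\
0 & 0 & b_0 & b_1 & b_2 & b_3 & \cdots \\
0 & 0 & 0 & 0 & b_0 & b_1 & \cdots
\end{bmatrix}
\begin{Bmatrix} \alpha_{j,2k} \\ \alpha_{j,2k+1} \\ \alpha_{j,2k+2} \\ \vdots \end{Bmatrix}
\tag{3.112}
$$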


Once again, note that the number of filter coefficients and the range of indices are
dependent on the support of the specific scaling function and wavelet pair being
used. As an example, equations (3.111) and (3.112) show the case where there are
four scaling function and wavelet filter coefficients with indices ranging from 0 to
3. Following the same notation as before, let the decomposition filter matrices in
equations (3.111) and (3.112) be denoted as $[A_{j-1,j}]$ and $[B_{j-1,j}]$, respectively. Using
vector notation, equations (3.111) and (3.112) can be written as

$$\underline{\alpha}_{j-1} = [A_{j-1,j}]\,\underline{\alpha}_j \tag{3.113}$$

$$\underline{\beta}_{j-1} = [B_{j-1,j}]\,\underline{\alpha}_j \tag{3.114}$$

where $\underline{\alpha}_j$ represents a vector of the scaling function coefficients on level $j$, and $\underline{\alpha}_{j-1}$,
$\underline{\beta}_{j-1}$ are vectors representing the scaling function and wavelet coefficients on level
$j-1$.

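In general, a multiresolution analysis is carried out over a number of levels. For
example, suppose a function is to be decomposed over two levels. It is known, from
equations (3.113) and (3.114), that

$$\underline{\alpha}_{j-1} = [A_{j-1,j}]\,\underline{\alpha}_j$$

$$\underline{\beta}_{j-1} = [B_{j-1,j}]\,\underline{\alpha}_j$$

Similarly, we can write

$$\underline{\alpha}_{j-2} = [A_{j-2,j-1}]\,\underline{\alpha}_{j-1}$$

$$\underline{\beta}_{j-2} = [B_{j-2,j-1}]\,\underline{\alpha}_{j-1}$$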

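Often, $\underline{\alpha}_{j-1}$ and $\underline{\beta}_{j-1}$ are combined into a stacked vector so that equations (3.113)
and (3.114) take the form

$$
\begin{Bmatrix} \underline{\alpha}_{j-1} \\ \underline{\beta}_{j-1} \end{Bmatrix}
=
\begin{bmatrix} [A_{j-1,j}] \\ [B_{j-1,j}] \end{bmatrix}
\underline{\alpha}_j
\tag{3.115}
$$

Since $\underline{\alpha}_{j-2}$ and $\underline{\beta}_{j-2}$ are calculated from $\underline{\alpha}_{j-1}$ and are not dependent on $\underline{\beta}_{j-1}$, we can
write

$$
\begin{Bmatrix} \underline{\alpha}_{j-2} \\ \underline{\beta}_{j-2} \\ \underline{\beta}_{j-1} \end{Bmatrix}
=
\begin{bmatrix}
[A_{j-2,j-1}] & [0] \\
[B_{j-2,j-1}] & [0] \\
[0] & [I]
\end{bmatrix}
\begin{Bmatrix} \underline{\alpha}_{j-1} \\ \underline{\beta}_{j-1} \end{Bmatrix}
\tag{3.116}
$$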


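Combining equations (3.115) and (3.116), the following expression is obtained:

$$\underline{\beta} = [T_{j-2,j}]\,\underline{\alpha}_j \tag{3.117}$$

where the vector of multiscale coefficients $\underline{\beta}$ and the transformation matrix $[T_{j-2,j}]$
are defined as

$$
\underline{\beta} := \begin{Bmatrix} \underline{\alpha}_{j-2} \\ \underline{\beta}_{j-2} \\ \underline{\beta}_{j-1} \end{Bmatrix}
\tag{3.118}
$$

$$
[T_{j-2,j}] :=
\begin{bmatrix}
[A_{j-2,j-1}] & [0] \\
[B_{j-2,j-1}] & [0] \\
[0] & [I]
\end{bmatrix}
\begin{bmatrix} [A_{j-1,j}] \\ [B_{j-1,j}] \end{bmatrix}
\tag{3.119}
$$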
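The two-level transformation in equations (3.115) through (3.119) can be compared against the recursive application of equations (3.113) and (3.114). The following sketch assumes the Haar filters and eight fine-scale coefficients; `haar_filter_matrices` and `two_level_matrix` are hypothetical helper names.

```python
import numpy as np

def haar_filter_matrices(n_fine):
    """Return (A, B), the (n_fine/2) x n_fine Haar decomposition filter matrices."""
    half = n_fine // 2
    c = 1.0 / np.sqrt(2.0)
    A = np.zeros((half, n_fine))
    B = np.zeros((half, n_fine))
    for m in range(half):
        A[m, 2 * m], A[m, 2 * m + 1] = c, c
        B[m, 2 * m], B[m, 2 * m + 1] = c, -c
    return A, B

def two_level_matrix(n):
    """Assemble [T_{j-2,j}] as the product of the two block matrices in (3.119)."""
    A1, B1 = haar_filter_matrices(n)        # level j   -> level j-1
    A2, B2 = haar_filter_matrices(n // 2)   # level j-1 -> level j-2
    h, q = n // 2, n // 4
    left = np.block([[A2, np.zeros((q, h))],
                     [B2, np.zeros((q, h))],
                     [np.zeros((h, h)), np.eye(h)]])
    right = np.vstack([A1, B1])
    return left @ right

n = 8
alpha_j = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])

# Recursive route: apply (3.113) and (3.114) twice.
A1, B1 = haar_filter_matrices(n)
A2, B2 = haar_filter_matrices(n // 2)
alpha_1, beta_1 = A1 @ alpha_j, B1 @ alpha_j
alpha_2, beta_2 = A2 @ alpha_1, B2 @ alpha_1
multiscale_recursive = np.concatenate([alpha_2, beta_2, beta_1])

# Matrix route: equation (3.117).
multiscale_matrix = two_level_matrix(n) @ alpha_j
```

The two routes give the same multiscale vector. The recursive route touches only the nonzero filter entries, whereas the matrix route stores and multiplies by mostly zero blocks.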


Recall that in a wavelet-based multiresolution analysis, a function is expressed as
a linear combination of wavelets over all the levels used in the decomposition and
the scaling functions on only the coarsest level. In the preceding example, where
the function is decomposed over two levels, the vector of multiscale coefficients is
composed of wavelet coefficients on both levels, $\underline{\beta}_{j-1}$ and $\underline{\beta}_{j-2}$, and scaling coefficients
$\underline{\alpha}_{j-2}$ on the coarsest level alone. Equation (3.117) shows in matrix form how the
multiscale coefficients are obtained from the single-scale coefficients for the example
we have considered. This result can easily be extended to decompositions over more
than two levels. It should be noted that, in practice, one does not compute the
wavelet transform by explicitly forming a decomposition matrix such as the one shown
in equation (3.117). Instead, the decomposition formulas given by equations (3.109)
and (3.110) are applied recursively. The matrix form has been given here to provide
additional insight into the structure of decomposition over several levels.

A reconstruction formula, applicable to all orthonormal wavelet families, can be
derived by taking the inner product of equation (3.108) with the scaling function $\phi_{j,m}$:

$$\sum_k \alpha_{j,k} \int_{\mathbb{R}} \phi_{j,k}(x)\,\phi_{j,m}(x)\,dx = \sum_k \alpha_{j-1,k} \int_{\mathbb{R}} \phi_{j-1,k}(x)\,\phi_{j,m}(x)\,dx + \sum_k \beta_{j-1,k} \int_{\mathbb{R}} \psi_{j-1,k}(x)\,\phi_{j,m}(x)\,dx$$

Substituting for $\phi_{j-1,k}$ and $\psi_{j-1,k}$ in terms of the two-scale equations, we obtain

$$\sum_k \alpha_{j,k} \int_{\mathbb{R}} \phi_{j,k}(x)\,\phi_{j,m}(x)\,dx = \sum_{k,s} \alpha_{j-1,k}\,a_s \int_{\mathbb{R}} \phi_{j,2k+s}(x)\,\phi_{j,m}(x)\,dx + \sum_{k,s} \beta_{j-1,k}\,b_s \int_{\mathbb{R}} \phi_{j,2k+s}(x)\,\phi_{j,m}(x)\,dx$$

By orthogonality, this reduces to

$$\alpha_{j,m} = \sum_{k,s} \alpha_{j-1,k}\,a_s \int_{\mathbb{R}} \phi_{j,2k+s}(x)\,\phi_{j,m}(x)\,dx + \sum_{k,s} \beta_{j-1,k}\,b_s \int_{\mathbb{R}} \phi_{j,2k+s}(x)\,\phi_{j,m}(x)\,dx$$

Making use of the orthogonality of the scaling functions, this can be simplified to the
form

$$\alpha_{j,m} = \sum_{k,s} \alpha_{j-1,k}\,a_s\,\delta_{2k+s,m} + \sum_{k,s} \beta_{j-1,k}\,b_s\,\delta_{2k+s,m}$$

Finally, the following reconstruction formula for $\alpha_{j,m}$ is obtained:

$$\alpha_{j,m} = \sum_k \alpha_{j-1,k}\,a_{m-2k} + \sum_k \beta_{j-1,k}\,b_{m-2k} \tag{3.120}$$
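A round trip through the decomposition formulas (3.109) and (3.110) followed by the reconstruction formula (3.120) should return the original coefficients. The sketch below illustrates this under two assumptions that are not part of the text: the index is wrapped periodically on a finite vector, and the four-tap Daubechies filters (with $b_s = (-1)^s a_{3-s}$) are used as an example of a non-Haar orthonormal family. Function names are hypothetical.

```python
import numpy as np

def decompose(alpha_j, a, b):
    """Equations (3.109) and (3.110) with periodic index wrapping."""
    n = alpha_j.size
    alpha_c = np.zeros(n // 2)
    beta_c = np.zeros(n // 2)
    for m in range(n // 2):
        for s in range(len(a)):
            k = (2 * m + s) % n
            alpha_c[m] += a[s] * alpha_j[k]
            beta_c[m] += b[s] * alpha_j[k]
    return alpha_c, beta_c

def reconstruct(alpha_c, beta_c, a, b):
    """Equation (3.120): alpha_{j,m} = sum_k a_{m-2k} alpha_{j-1,k} + b_{m-2k} beta_{j-1,k}."""
    n = 2 * alpha_c.size
    alpha_j = np.zeros(n)
    for k in range(alpha_c.size):
        for s in range(len(a)):
            m = (2 * k + s) % n          # so that m - 2k = s
            alpha_j[m] += a[s] * alpha_c[k] + b[s] * beta_c[k]
    return alpha_j

# Assumed example filters (four-tap Daubechies).
s3 = np.sqrt(3.0)
a = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4.0 * np.sqrt(2.0))
b = np.array([a[3], -a[2], a[1], -a[0]])

rng = np.random.default_rng(0)
alpha_j = rng.standard_normal(16)
alpha_c, beta_c = decompose(alpha_j, a, b)
alpha_rec = reconstruct(alpha_c, beta_c, a, b)
```

Under these assumptions `alpha_rec` agrees with `alpha_j` to rounding error, and a constant input produces zero wavelet coefficients.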


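It is convenient to write equation (3.120) in matrix form:

$$
\begin{Bmatrix} \alpha_{j,2k} \\ \alpha_{j,2k+1} \\ \vdots \\ \alpha_{j,2k+5} \end{Bmatrix}
=
\begin{bmatrix}
a_0 & 0 \\
a_1 & 0 \\
a_2 & a_0 \\
a_3 & a_1 \\
0 & a_2 \\
0 & a_3
\end{bmatrix}
\begin{Bmatrix} \alpha_{j-1,k} \\ \alpha_{j-1,k+1} \end{Bmatrix}
+
\begin{bmatrix}
b_0 & 0 \\
b_1 & 0 \\
b_2 & b_0 \\
b_3 & b_1 \\
0 & b_2 \\
0 & b_3
\end{bmatrix}
\begin{Bmatrix} \beta_{j-1,k} \\ \beta_{j-1,k+1} \end{Bmatrix}
\tag{3.121}
$$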
Just as in the Haar case, the matrices in the reconstruction formula are the transposes
of the decomposition filter matrices $[A_{j-1,j}]$ and $[B_{j-1,j}]$, and equation (3.121) can be
written as

$$\underline{\alpha}_j = [A_{j-1,j}]^T \underline{\alpha}_{j-1} + [B_{j-1,j}]^T \underline{\beta}_{j-1} \tag{3.122}$$

In general, wavelet reconstruction is carried out over multiple levels. Consider an
example in which the reconstruction is taking place over two levels, starting at level
$j-2$. This implies that the vectors of coefficients $\underline{\beta}_{j-1}$, $\underline{\beta}_{j-2}$, and $\underline{\alpha}_{j-2}$ are all known.
The goal of the reconstruction, then, is to calculate the single-scale coefficients $\underline{\alpha}_j$. It
is clear from equation (3.122) that we can write

$$\underline{\alpha}_{j-1} = [A_{j-2,j-1}]^T \underline{\alpha}_{j-2} + [B_{j-2,j-1}]^T \underline{\beta}_{j-2} \tag{3.123}$$

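Equations (3.123) and (3.121) can be written in the form

$$
\begin{Bmatrix} \underline{\alpha}_{j-1} \\ \underline{\beta}_{j-1} \end{Bmatrix}
=
\begin{bmatrix}
[A_{j-2,j-1}]^T & [B_{j-2,j-1}]^T & [0] \\
[0] & [0] & [I]
\end{bmatrix}
\begin{Bmatrix} \underline{\alpha}_{j-2} \\ \underline{\beta}_{j-2} \\ \underline{\beta}_{j-1} \end{Bmatrix}
\tag{3.124}
$$

$$
\underline{\alpha}_j
=
\begin{bmatrix} [A_{j-1,j}]^T & [B_{j-1,j}]^T \end{bmatrix}
\begin{Bmatrix} \underline{\alpha}_{j-1} \\ \underline{\beta}_{j-1} \end{Bmatrix}
\tag{3.125}
$$

Combining equations (3.124) and (3.125), we obtain

$$\underline{\alpha}_j = [T_{j,j-2}]\,\underline{\beta} \tag{3.126}$$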


where $\underline{\beta}$ is the multiscale vector defined in equation (3.118) and $[T_{j,j-2}]$ is defined as

$$
[T_{j,j-2}] :=
\begin{bmatrix} [A_{j-1,j}]^T & [B_{j-1,j}]^T \end{bmatrix}
\begin{bmatrix}
[A_{j-2,j-1}]^T & [B_{j-2,j-1}]^T & [0] \\
[0] & [0] & [I]
\end{bmatrix}
\tag{3.127}
$$

Note that, comparing equation (3.127) with equation (3.119), we have

$$[T_{j,j-2}] = [T_{j-2,j}]^T \tag{3.128}$$

That is, $[T_{j-2,j}]$, and the wavelet transform in general, is an orthogonal transformation.
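The orthogonality stated in equation (3.128) can be verified numerically. The following is a sketch under stated assumptions: the filter matrices are built with periodic index wrapping so that they are finite and square when stacked, and the four-tap Daubechies filters (with $b_s = (-1)^s a_{3-s}$) serve as an assumed example. The helper names are hypothetical.

```python
import numpy as np

def periodic_filter_matrix(f, n_fine):
    """Return the (n_fine/2) x n_fine matrix with entries f[k - 2m], wrapped periodically."""
    M = np.zeros((n_fine // 2, n_fine))
    for m in range(n_fine // 2):
        for s, fs in enumerate(f):
            M[m, (2 * m + s) % n_fine] += fs
    return M

def two_level_matrix(a, b, n):
    """Assemble [T_{j-2,j}] following the block structure of equation (3.119)."""
    A1, B1 = periodic_filter_matrix(a, n), periodic_filter_matrix(b, n)
    A2, B2 = periodic_filter_matrix(a, n // 2), periodic_filter_matrix(b, n // 2)
    h, q = n // 2, n // 4
    left = np.block([[A2, np.zeros((q, h))],
                     [B2, np.zeros((q, h))],
                     [np.zeros((h, h)), np.eye(h)]])
    return left @ np.vstack([A1, B1])

# Assumed example filters (four-tap Daubechies).
s3 = np.sqrt(3.0)
a = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4.0 * np.sqrt(2.0))
b = np.array([a[3], -a[2], a[1], -a[0]])

n = 16
T = two_level_matrix(a, b, n)              # decomposition, equation (3.117)
rng = np.random.default_rng(1)
alpha_j = rng.standard_normal(n)
multiscale = T @ alpha_j
alpha_rec = T.T @ multiscale               # reconstruction via (3.126) and (3.128)
```

In this sketch `T @ T.T` is the identity, so the reconstruction matrix is simply the transpose of the decomposition matrix.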


3.5  Orthonormal  Multiwavelets 

In the previous section, the Haar wavelet discussion was extended to include
all orthonormal wavelet families. In this section, the discussion will be generalized
further to consider multiresolution analyses that are generated by several scaling functions $\{\phi^1, \ldots, \phi^r\}$, where $r$ is the number of generators. The $r$ associated wavelets
$\{\psi^1, \ldots, \psi^r\}$ are termed multiwavelets. The case where $r = 1$, which has already been
discussed in detail, is known as the single-generator case. The motivation for multiwavelets is that using several functions enables more freedom in designing wavelet
families. For example, it is possible to construct piecewise-polynomial multiwavelets
that are orthonormal, compactly supported, and have arbitrary approximation order. In addition, one can obtain analytical forms for many multiwavelets. In the
next chapter, multiwavelets that satisfy these properties will be constructed from the
classical finite element basis functions. It will be demonstrated that this particular
class of multiwavelets can easily be adapted to boundaries. This section focuses on
developing general results for orthonormal multiwavelets and their associated multiresolution analyses.

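Given $r$ scaling functions $\{\phi^1, \ldots, \phi^r\}$ and $r$ associated wavelets $\{\psi^1, \ldots, \psi^r\}$,
the following vectors can be defined:

$$
\Phi := \begin{Bmatrix} \phi^1 \\ \phi^2 \\ \vdots \\ \phi^r \end{Bmatrix}
\qquad
\Psi := \begin{Bmatrix} \psi^1 \\ \psi^2 \\ \vdots \\ \psi^r \end{Bmatrix}
\tag{3.129}
$$

Orthonormal multiwavelets satisfy the following orthogonality conditions:

$$
\begin{aligned}
(1)\quad & \int_{\mathbb{R}} \phi^s_{j,k}(x)\,\phi^t_{j,m}(x)\,dx = \delta_{(k,s),(m,t)} \\
(2)\quad & \int_{\mathbb{R}} \psi^s_{j,k}(x)\,\psi^t_{l,m}(x)\,dx = \delta_{(j,k,s),(l,m,t)} \\
(3)\quad & \int_{\mathbb{R}} \phi^s_{j,k}(x)\,\psi^t_{l,m}(x)\,dx = 0 \quad \forall\, k, m \in \mathbb{Z},\ s, t \in \{1, \ldots, r\},\ j \le l
\end{aligned}
\tag{3.130}
$$

where we have the notation

$$\phi^s_{j,k} := 2^{j/2}\,\phi^s(2^j x - k) \tag{3.131}$$

$$\psi^s_{j,k} := 2^{j/2}\,\psi^s(2^j x - k) \tag{3.132}$$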

Using the vector notation in equation (3.129), the orthogonality properties in equation
(3.130) can be written in the equivalent form

$$
\begin{aligned}
(1)\quad & \int_{\mathbb{R}} \Phi_{j,k}(x)\,\Phi^T_{j,m}(x)\,dx = [I]\,\delta_{k,m} \\
(2)\quad & \int_{\mathbb{R}} \Psi_{j,k}(x)\,\Psi^T_{l,m}(x)\,dx = [I]\,\delta_{(j,k),(l,m)} \\
(3)\quad & \int_{\mathbb{R}} \Phi_{j,k}(x)\,\Psi^T_{l,m}(x)\,dx = [0] \quad \forall\, k, m \in \mathbb{Z},\ j \le l \\
(4)\quad & \int_{\mathbb{R}} \Psi_{j,k}(x)\,\Phi^T_{l,m}(x)\,dx = [0] \quad \forall\, k, m \in \mathbb{Z},\ j \ge l
\end{aligned}
\tag{3.133}
$$

where $[I]$ represents an $r \times r$ identity matrix and $[0]$ denotes an $r \times r$ matrix of zeros.
The two-scale equations for multiwavelet families are given by

$$\phi^s_{j-1,k}(x) = \sum_{p,t} a^{s,t}_p\,\phi^t_{j,2k+p}(x) \tag{3.134}$$

$$\psi^s_{j-1,k}(x) = \sum_{p,t} b^{s,t}_p\,\phi^t_{j,2k+p}(x) \tag{3.135}$$

In vector form, the two-scale equations can be written as

$$\Phi_{j-1,k}(x) = \sum_p [a_p]\,\Phi_{j,2k+p}(x) \tag{3.136}$$

$$\Psi_{j-1,k}(x) = \sum_p [b_p]\,\Phi_{j,2k+p}(x) \tag{3.137}$$

Note that these two-scale equations take the same form as in the single-generator
case. The main difference is that the scaling function and wavelet filters $\{[a_p]\}$, $\{[b_p]\}$
are matrices instead of scalars.

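A multiresolution analysis that is generated by $r$ scaling functions $\{\phi^1, \ldots, \phi^r\}$
is termed a multiresolution analysis of multiplicity $r$. A multiresolution analysis
of multiplicity $r$ is a family of subspaces $\{V_j\}_{j \in \mathbb{Z}}$ of $L^2(\mathbb{R})$ satisfying the following
properties [2]:

$$
\begin{aligned}
(1)\quad & \cdots \subset V_{-2} \subset V_{-1} \subset V_0 \subset V_1 \subset V_2 \subset \cdots \\
(2)\quad & \bigcap_{j \in \mathbb{Z}} V_j = \{0\} \\
(3)\quad & \overline{\bigcup_{j \in \mathbb{Z}} V_j} = L^2(\mathbb{R}) \\
(4)\quad & f(x) \in V_j \iff f(2x) \in V_{j+1} \\
(5)\quad & f(x) \in V_j \implies f(x - k) \in V_j \quad \forall\, k \in \mathbb{Z} \\
(6)\quad & \text{There are } r \text{ functions } \{\phi^1, \ldots, \phi^r\} \text{ such that the set of translates} \\
 & \{\phi^s(x - k) : s = 1, \ldots, r\}_{k \in \mathbb{Z}} \text{ forms a Riesz basis for } V_0
\end{aligned}
\tag{3.138}
$$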

The properties of a multiresolution analysis that is generated by multiple scaling
functions are essentially the same as those for the single-generator case. The only
difference is in the number of generators. The space $V_0$ is defined as

$$V_0 := \operatorname{span}\{\phi^s(\cdot - k) : s = 1, \ldots, r\}_{k \in \mathbb{Z}} \tag{3.139}$$

and, in general, the space $V_j$ is defined as

$$V_j := \operatorname{span}\{\phi^s_{j,k} : s = 1, \ldots, r\}_{k \in \mathbb{Z}} \tag{3.140}$$

Similarly, the wavelet space $W_j$ that comprises the difference between two adjacent
spaces $V_{j+1}$ and $V_j$,

$$W_j = V_{j+1} \ominus V_j \tag{3.141}$$

is defined as

$$W_j := \operatorname{span}\{\psi^s_{j,k} : s = 1, \ldots, r\}_{k \in \mathbb{Z}} \tag{3.142}$$

The space $W_0$ is defined as

$$W_0 := \operatorname{span}\{\psi^s(\cdot - k) : s = 1, \ldots, r\}_{k \in \mathbb{Z}} \tag{3.143}$$

Just as before, we have the following multiscale decomposition of $V_j$:

$$V_j = V_{j_0} \oplus W_{j_0} \oplus W_{j_0+1} \oplus \cdots \oplus W_{j-2} \oplus W_{j-1} \tag{3.144}$$

where $j_0$ is the coarsest level in the decomposition.

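Similar to the single-generator case, we have a single-scale approximation of a
function $f$ of the form

$$f_j(x) = \sum_{s=1}^{r} \sum_{k \in \mathbb{Z}} \alpha^s_{j,k}\,\phi^s_{j,k}(x) \tag{3.145}$$

and an equivalent, two-scale expansion that can be written as

$$f_j(x) = \sum_{s=1}^{r} \sum_{k \in \mathbb{Z}} \alpha^s_{j-1,k}\,\phi^s_{j-1,k}(x) + \sum_{s=1}^{r} \sum_{k \in \mathbb{Z}} \beta^s_{j-1,k}\,\psi^s_{j-1,k}(x) \tag{3.146}$$

A multilevel representation of $f_j$ corresponding to equation (3.144) is given by

$$f_j(x) = \sum_{s=1}^{r} \sum_{k \in \mathbb{Z}} \alpha^s_{j_0,k}\,\phi^s_{j_0,k}(x) + \sum_{l=j_0}^{j-1} \sum_{s=1}^{r} \sum_{k \in \mathbb{Z}} \beta^s_{l,k}\,\psi^s_{l,k}(x) \tag{3.147}$$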

Note that the expressions for $f_j$ in equations (3.145) through (3.147) are of the same
form as the expansions that are given in the single-generator case. The only difference
is that, for multiwavelets, it is necessary to sum over the number of different functions
as well. Of course, when $r = 1$, equations (3.145) through (3.147) reduce to the single-generator results. Introducing the vector notation

$$
\underline{\alpha}_{j,k} := \begin{Bmatrix} \alpha^1_{j,k} \\ \vdots \\ \alpha^r_{j,k} \end{Bmatrix}
\tag{3.148}
$$

the single-scale approximation in equation (3.145) can be written as

$$f_j(x) = \sum_{k \in \mathbb{Z}} \underline{\alpha}^T_{j,k}\,\Phi_{j,k}(x) \tag{3.149}$$

and the two-scale expansion in equation (3.146) takes the form

$$f_j(x) = \sum_{k \in \mathbb{Z}} \underline{\alpha}^T_{j-1,k}\,\Phi_{j-1,k}(x) + \sum_{k \in \mathbb{Z}} \underline{\beta}^T_{j-1,k}\,\Psi_{j-1,k}(x) \tag{3.150}$$

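The derivation of decomposition formulas which can be used recursively to calculate the multiscale coefficients in equation (3.147) follows the same procedure as
the single-generator case. It is convenient to work with the vector form of the equations, although, as will be seen later, the vector notation is not easy to generalize to
higher dimensions. The first step in deriving multiwavelet decomposition formulas is
to express the equivalence of the single-scale and two-scale expansions of a function:

$$\sum_{k \in \mathbb{Z}} \underline{\alpha}^T_{j,k}\,\Phi_{j,k}(x) = \sum_{k \in \mathbb{Z}} \underline{\alpha}^T_{j-1,k}\,\Phi_{j-1,k}(x) + \sum_{k \in \mathbb{Z}} \underline{\beta}^T_{j-1,k}\,\Psi_{j-1,k}(x) \tag{3.151}$$

Post-multiplying both sides of equation (3.151) by $\Phi^T_{j-1,m}(x)$ and integrating over $\mathbb{R}$, we
have

$$\sum_k \underline{\alpha}^T_{j,k} \int_{\mathbb{R}} \Phi_{j,k}(x)\,\Phi^T_{j-1,m}(x)\,dx = \sum_k \underline{\alpha}^T_{j-1,k} \int_{\mathbb{R}} \Phi_{j-1,k}(x)\,\Phi^T_{j-1,m}(x)\,dx + \sum_k \underline{\beta}^T_{j-1,k} \int_{\mathbb{R}} \Psi_{j-1,k}(x)\,\Phi^T_{j-1,m}(x)\,dx$$

By orthogonality, this reduces to

$$\sum_k \underline{\alpha}^T_{j,k} \int_{\mathbb{R}} \Phi_{j,k}(x)\,\Phi^T_{j-1,m}(x)\,dx = \sum_k \underline{\alpha}^T_{j-1,k}\,[I]\,\delta_{k,m}$$

and

$$\underline{\alpha}^T_{j-1,m} = \sum_k \underline{\alpha}^T_{j,k} \int_{\mathbb{R}} \Phi_{j,k}(x)\,\Phi^T_{j-1,m}(x)\,dx$$

Transposing the two-scale relationship, equation (3.136), we obtain the following
expression for $\Phi^T_{j-1,m}$:

$$\Phi^T_{j-1,m}(x) = \sum_p \Phi^T_{j,2m+p}(x)\,[a_p]^T$$

Then, substituting for $\Phi^T_{j-1,m}$, we have

$$\underline{\alpha}^T_{j-1,m} = \sum_{k,p} \underline{\alpha}^T_{j,k} \int_{\mathbb{R}} \Phi_{j,k}(x) \left( \Phi^T_{j,2m+p}(x)\,[a_p]^T \right) dx$$

Once again using the orthogonality properties of the multiwavelets, this becomes

$$\underline{\alpha}^T_{j-1,m} = \sum_{k,p} \underline{\alpha}^T_{j,k}\,[I]\,\delta_{k,2m+p}\,[a_p]^T$$

Finally, this can be simplified to

$$\underline{\alpha}^T_{j-1,m} = \sum_k \underline{\alpha}^T_{j,k}\,[a_{k-2m}]^T$$

Taking the transpose, the following decomposition formula is obtained:

$$\underline{\alpha}_{j-1,m} = \sum_k [a_{k-2m}]\,\underline{\alpha}_{j,k} \tag{3.152}$$

Following a similar procedure, the following decomposition formula for the vectors of
wavelet coefficients $\{\underline{\beta}_{j-1,m}\}_{m \in \mathbb{Z}}$ can be derived:

$$\underline{\beta}_{j-1,m} = \sum_k [b_{k-2m}]\,\underline{\alpha}_{j,k} \tag{3.153}$$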

Equation (3.153) is obtained by post-multiplying equation (3.151) by $\Psi^T_{j-1,m}(x)$ and
integrating over $\mathbb{R}$. The resulting expression is then simplified by using the orthogonality properties of the multiwavelets and the two-scale relationship given in equation
(3.137). Note that the decomposition formulas can also be written in the form

$$\alpha^s_{j-1,m} = \sum_{t=1}^{r} \sum_k a^{s,t}_{k-2m}\,\alpha^t_{j,k} \tag{3.154}$$

$$\beta^s_{j-1,m} = \sum_{t=1}^{r} \sum_k b^{s,t}_{k-2m}\,\alpha^t_{j,k} \tag{3.155}$$
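The equivalence between the vector form (3.152) and the component form (3.154) can be illustrated with a short numerical comparison. In the sketch below the matrix filters are random $2 \times 2$ matrices, chosen purely as hypothetical stand-ins (they do not correspond to any actual multiwavelet family), and the index $k$ is wrapped periodically on a finite array; both choices are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
r, n, n_taps = 2, 8, 4
# Hypothetical matrix filters: filt[p] plays the role of [a_p] (random stand-ins).
filt = rng.standard_normal((n_taps, r, r))
# alpha_j[k] is the coefficient vector alpha_{j,k} of length r.
alpha_j = rng.standard_normal((n, r))

def decompose_vector(alpha_j, filt):
    """Vector form (3.152): alpha_{j-1,m} = sum_k [a_{k-2m}] alpha_{j,k}."""
    n, r = alpha_j.shape
    out = np.zeros((n // 2, r))
    for m in range(n // 2):
        for p in range(filt.shape[0]):
            k = (2 * m + p) % n          # so that k - 2m = p
            out[m] += filt[p] @ alpha_j[k]
    return out

def decompose_component(alpha_j, filt):
    """Component form (3.154): alpha^s_{j-1,m} = sum_t sum_k a^{s,t}_{k-2m} alpha^t_{j,k}."""
    n, r = alpha_j.shape
    out = np.zeros((n // 2, r))
    for m in range(n // 2):
        for s in range(r):
            for t in range(r):
                for p in range(filt.shape[0]):
                    k = (2 * m + p) % n
                    out[m, s] += filt[p, s, t] * alpha_j[k, t]
    return out

coarse_vec = decompose_vector(alpha_j, filt)
coarse_cmp = decompose_component(alpha_j, filt)
```

The two results agree to rounding error; the component form simply writes out each matrix-vector product entry by entry.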

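A reconstruction formula can be derived for orthonormal multiwavelet families
by post-multiplying equation (3.151) by $\Phi^T_{j,m}(x)$ and integrating over $\mathbb{R}$:

$$\sum_k \underline{\alpha}^T_{j,k} \int_{\mathbb{R}} \Phi_{j,k}(x)\,\Phi^T_{j,m}(x)\,dx = \sum_k \underline{\alpha}^T_{j-1,k} \int_{\mathbb{R}} \Phi_{j-1,k}(x)\,\Phi^T_{j,m}(x)\,dx + \sum_k \underline{\beta}^T_{j-1,k} \int_{\mathbb{R}} \Psi_{j-1,k}(x)\,\Phi^T_{j,m}(x)\,dx$$

Using the orthogonality properties of the multiwavelets, we obtain

$$\sum_k \underline{\alpha}^T_{j,k}\,[I]\,\delta_{k,m} = \sum_k \underline{\alpha}^T_{j-1,k} \int_{\mathbb{R}} \Phi_{j-1,k}(x)\,\Phi^T_{j,m}(x)\,dx + \sum_k \underline{\beta}^T_{j-1,k} \int_{\mathbb{R}} \Psi_{j-1,k}(x)\,\Phi^T_{j,m}(x)\,dx$$

and

$$\underline{\alpha}^T_{j,m} = \sum_k \underline{\alpha}^T_{j-1,k} \int_{\mathbb{R}} \Phi_{j-1,k}(x)\,\Phi^T_{j,m}(x)\,dx + \sum_k \underline{\beta}^T_{j-1,k} \int_{\mathbb{R}} \Psi_{j-1,k}(x)\,\Phi^T_{j,m}(x)\,dx$$

Substituting for $\Phi_{j-1,k}$ and $\Psi_{j-1,k}$ using the two-scale equations, we have

$$\underline{\alpha}^T_{j,m} = \sum_{k,p} \underline{\alpha}^T_{j-1,k}\,[a_p] \int_{\mathbb{R}} \Phi_{j,2k+p}(x)\,\Phi^T_{j,m}(x)\,dx + \sum_{k,p} \underline{\beta}^T_{j-1,k}\,[b_p] \int_{\mathbb{R}} \Phi_{j,2k+p}(x)\,\Phi^T_{j,m}(x)\,dx$$

By orthogonality, this reduces to

$$\underline{\alpha}^T_{j,m} = \sum_{k,p} \underline{\alpha}^T_{j-1,k}\,[a_p]\,[I]\,\delta_{m,2k+p} + \sum_{k,p} \underline{\beta}^T_{j-1,k}\,[b_p]\,[I]\,\delta_{m,2k+p}$$

Finally, this can be simplified to the form

$$\underline{\alpha}^T_{j,m} = \sum_k \underline{\alpha}^T_{j-1,k}\,[a_{m-2k}] + \sum_k \underline{\beta}^T_{j-1,k}\,[b_{m-2k}]$$

Transposing this result, the following reconstruction formula is obtained:

$$\underline{\alpha}_{j,m} = \sum_k \left\{ [a_{m-2k}]^T \underline{\alpha}_{j-1,k} + [b_{m-2k}]^T \underline{\beta}_{j-1,k} \right\} \tag{3.156}$$

Note that equation (3.156) can be written in the equivalent form

$$\alpha^s_{j,m} = \sum_{t=1}^{r} \sum_k \left\{ a^{t,s}_{m-2k}\,\alpha^t_{j-1,k} + b^{t,s}_{m-2k}\,\beta^t_{j-1,k} \right\} \tag{3.157}$$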
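The decomposition formulas (3.152) and (3.153) and the reconstruction formula (3.156) can be exercised together as a round trip. The sketch below uses a deliberately simple, hypothetical set of $2 \times 2$ matrix filters obtained by multiplying the scalar Haar filters by a rotation matrix $Q$. This set satisfies the discrete orthogonality identities needed for perfect reconstruction, but it is not claimed to correspond to actual scaling functions, and it is not the piecewise-polynomial family constructed later in the text. Periodic index wrapping is also an assumption.

```python
import numpy as np

theta = 0.3
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
c = 1.0 / np.sqrt(2.0)
a = np.array([c * Q, c * Q])      # hypothetical [a_0], [a_1]
b = np.array([c * Q, -c * Q])     # hypothetical [b_0], [b_1]

def decompose(alpha_j, filt):
    """Equations (3.152)/(3.153): out_m = sum_k [f_{k-2m}] alpha_{j,k}."""
    n, r = alpha_j.shape
    out = np.zeros((n // 2, r))
    for m in range(n // 2):
        for p in range(filt.shape[0]):
            out[m] += filt[p] @ alpha_j[(2 * m + p) % n]
    return out

def reconstruct(alpha_c, beta_c, a, b):
    """Equation (3.156): alpha_{j,m} = sum_k [a_{m-2k}]^T alpha_{j-1,k} + [b_{m-2k}]^T beta_{j-1,k}."""
    half, r = alpha_c.shape
    n = 2 * half
    out = np.zeros((n, r))
    for k in range(half):
        for p in range(a.shape[0]):
            m = (2 * k + p) % n          # so that m - 2k = p
            out[m] += a[p].T @ alpha_c[k] + b[p].T @ beta_c[k]
    return out

rng = np.random.default_rng(2)
alpha_j = rng.standard_normal((8, 2))
alpha_c = decompose(alpha_j, a)
beta_c = decompose(alpha_j, b)
alpha_rec = reconstruct(alpha_c, beta_c, a, b)
```

Note that the reconstruction uses the transposed filter matrices, exactly as in equation (3.156); omitting the transpose would break the round trip for any non-symmetric filter such as the rotated ones used here.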

3.6  Tensor  Product  Wavelets 

To  this  point,  the  discussion  has  been  focused  on  wavelets  in  one  dimension. 
For  many  wavelet  families,  the  preceding  results  can  easily  be  extended  to  higher 
dimensions  using  tensor  product  wavelets.  In  this  section,  two-dimensional  tensor 
product  wavelets  and  their  associated  multiresolution  analyses  will  be  discussed  in 
detail.  In  particular,  decomposition  and  reconstruction  formulas,  similar  to  those 
derived  for  the  one-dimensional  case,  will  be  developed.  The  discussion  will  treat 
the  more  general  case  of  tensor  product  multiwavelets,  since  single-generator  wavelet 
families  can  be  viewed  as  specific  types  of  multiwavelets. 

Given a multiresolution analysis $\{V_j\}_{j \in \mathbb{Z}}$ of multiplicity $r$, generated by $r$ scaling
functions $\{\phi^1, \ldots, \phi^r\}$, two-dimensional scaling functions can be obtained by taking
the tensor products of the scaling functions in one dimension ($x$-direction) with those
in the other dimension ($y$-direction). Hence, a total of $r^2$ two-dimensional scaling
functions are obtained:

$$\Phi^{(s,t)}(x, y) := \phi^s(x)\,\phi^t(y) \tag{3.158}$$

where $s, t \in \{1, \ldots, r\}$. Given $r$ associated wavelets $\{\psi^1, \ldots, \psi^r\}$, three different
types of tensor product wavelets are obtained, taking the form

$$\Psi^{1,(s,t)}(x, y) := \phi^s(x)\,\psi^t(y) \tag{3.159}$$

$$\Psi^{2,(s,t)}(x, y) := \psi^s(x)\,\phi^t(y) \tag{3.160}$$

$$\Psi^{3,(s,t)}(x, y) := \psi^s(x)\,\psi^t(y) \tag{3.161}$$

where, once again, $s, t \in \{1, \ldots, r\}$. Therefore, there are a total of $3r^2$ two-dimensional
wavelets. It is convenient to introduce the notation:

$$
\begin{aligned}
\Phi^{(s,t)}_{j,(k,m)}(x, y) &:= \phi^s_{j,k}(x)\,\phi^t_{j,m}(y) \\
\Psi^{1,(s,t)}_{j,(k,m)}(x, y) &:= \phi^s_{j,k}(x)\,\psi^t_{j,m}(y) \\
\Psi^{2,(s,t)}_{j,(k,m)}(x, y) &:= \psi^s_{j,k}(x)\,\phi^t_{j,m}(y) \\
\Psi^{3,(s,t)}_{j,(k,m)}(x, y) &:= \psi^s_{j,k}(x)\,\psi^t_{j,m}(y)
\end{aligned}
\tag{3.162}
$$

where, as before,

$$\phi^s_{j,k}(x) = 2^{j/2}\,\phi^s(2^j x - k) \tag{3.163}$$

$$\psi^s_{j,k}(x) = 2^{j/2}\,\psi^s(2^j x - k) \tag{3.164}$$

Note,  once  again,  that  there  are  three  types  of  wavelets.  The  superscript  (1,2,  or  3) 
indicates  whether  the  wavelet  is  formed  as  the  tensor  product  of  a scaling  function  in 
the  first  dimension  and  a wavelet  in  the  second  dimension  (type  1),  a wavelet  in  the 
first  dimension  and  a scaling  function  in  the  second  dimension  (type  2),  or  a wavelet 
in both dimensions (type 3). In addition, each of the functions in equation (3.162)
has a superscript $(s,t)$ which indicates the individual scaling functions or wavelets
that compose each two-dimensional function. Just as in the one-dimensional case,
the subscript $j$ represents the discretization level. In two dimensions, we have a grid
composed of $2^j \times 2^j$ square elements. Finally, the two-dimensional scaling functions
and wavelets are indexed by a translate $(k,m)$. The index $k$ indicates the position
of the function in the first dimension while the index $m$ gives the position of the
function in the second dimension.

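Tensor product wavelets are amenable to multiresolution analysis in much the
same manner as their one-dimensional counterparts. We define the two-dimensional
spaces

$$V_j := \operatorname{span}\left\{ \Phi^{(s,t)}_{j,(k,m)} : s, t \in \{1, \ldots, r\} \right\}_{k,m \in \mathbb{Z}} \tag{3.165}$$

$$W^1_j := \operatorname{span}\left\{ \Psi^{1,(s,t)}_{j,(k,m)} : s, t \in \{1, \ldots, r\} \right\}_{k,m \in \mathbb{Z}} \tag{3.166}$$

$$W^2_j := \operatorname{span}\left\{ \Psi^{2,(s,t)}_{j,(k,m)} : s, t \in \{1, \ldots, r\} \right\}_{k,m \in \mathbb{Z}} \tag{3.167}$$

$$W^3_j := \operatorname{span}\left\{ \Psi^{3,(s,t)}_{j,(k,m)} : s, t \in \{1, \ldots, r\} \right\}_{k,m \in \mathbb{Z}} \tag{3.168}$$

Then, the space $V_j$ can be decomposed as follows:

$$V_j = V_{j-1} \oplus W^1_{j-1} \oplus W^2_{j-1} \oplus W^3_{j-1} \tag{3.169}$$

If the decomposition is carried out over several levels, we have

$$V_j = V_{j_0} \oplus \bigoplus_{l=j_0}^{j-1} \left\{ W^1_l \oplus W^2_l \oplus W^3_l \right\} \tag{3.170}$$

where $j_0$ is the coarsest level used in the multiscale expansion.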

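A single-scale representation of a two-dimensional function $f$ in terms of the
basis functions of $V_j$ can be written as

$$f_j(x, y) = \sum_{k,m,s,t} \alpha^{(s,t)}_{j,(k,m)}\,\Phi^{(s,t)}_{j,(k,m)}(x, y) \tag{3.171}$$

where $\alpha^{(s,t)}_{j,(k,m)}$ are constant scaling function coefficients. Equation (3.169) enables
us to write an equivalent two-scale expansion of the form

$$
\begin{aligned}
f_j(x, y) = {} & \sum_{k,m,s,t} \alpha^{(s,t)}_{j-1,(k,m)}\,\Phi^{(s,t)}_{j-1,(k,m)}(x, y) + \sum_{k,m,s,t} \beta^{1,(s,t)}_{j-1,(k,m)}\,\Psi^{1,(s,t)}_{j-1,(k,m)}(x, y) \\
& + \sum_{k,m,s,t} \beta^{2,(s,t)}_{j-1,(k,m)}\,\Psi^{2,(s,t)}_{j-1,(k,m)}(x, y) + \sum_{k,m,s,t} \beta^{3,(s,t)}_{j-1,(k,m)}\,\Psi^{3,(s,t)}_{j-1,(k,m)}(x, y)
\end{aligned}
\tag{3.172}
$$

where $\beta^{i,(s,t)}_{j-1,(k,m)}$, $i = 1, 2, 3$, are constant wavelet coefficients. A multiscale expansion of $f_j$, corresponding to equation (3.170), can be written as

$$
\begin{aligned}
f_j(x, y) = {} & \sum_{k,m,s,t} \alpha^{(s,t)}_{j_0,(k,m)}\,\Phi^{(s,t)}_{j_0,(k,m)}(x, y) + \sum_{l=j_0}^{j-1} \sum_{k,m,s,t} \beta^{1,(s,t)}_{l,(k,m)}\,\Psi^{1,(s,t)}_{l,(k,m)}(x, y) \\
& + \sum_{l=j_0}^{j-1} \sum_{k,m,s,t} \beta^{2,(s,t)}_{l,(k,m)}\,\Psi^{2,(s,t)}_{l,(k,m)}(x, y) + \sum_{l=j_0}^{j-1} \sum_{k,m,s,t} \beta^{3,(s,t)}_{l,(k,m)}\,\Psi^{3,(s,t)}_{l,(k,m)}(x, y)
\end{aligned}
\tag{3.173}
$$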


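Two-dimensional decomposition formulas can be derived in a similar manner as
the one-dimensional case. A decomposition formula for the coefficients $\{\alpha^{(s,t)}_{j-1,(k,m)}\}$
in equation (3.172) can be obtained by multiplying the single-scale and two-scale
representations of $f_j$ by $\Phi^{(u,v)}_{j-1,(a,c)}$ and integrating over the two-dimensional space $\mathbb{R}^2$:

$$
\begin{aligned}
\int_{\mathbb{R}^2} \sum_{k,m,s,t} \alpha^{(s,t)}_{j,(k,m)}\,\Phi^{(s,t)}_{j,(k,m)}(x, y)\,\Phi^{(u,v)}_{j-1,(a,c)}(x, y)\,dx\,dy
= {} & \int_{\mathbb{R}^2} \sum_{k,m,s,t} \alpha^{(s,t)}_{j-1,(k,m)}\,\Phi^{(s,t)}_{j-1,(k,m)}(x, y)\,\Phi^{(u,v)}_{j-1,(a,c)}(x, y)\,dx\,dy \\
& + \int_{\mathbb{R}^2} \sum_{k,m,s,t} \beta^{1,(s,t)}_{j-1,(k,m)}\,\Psi^{1,(s,t)}_{j-1,(k,m)}(x, y)\,\Phi^{(u,v)}_{j-1,(a,c)}(x, y)\,dx\,dy \\
& + \int_{\mathbb{R}^2} \sum_{k,m,s,t} \beta^{2,(s,t)}_{j-1,(k,m)}\,\Psi^{2,(s,t)}_{j-1,(k,m)}(x, y)\,\Phi^{(u,v)}_{j-1,(a,c)}(x, y)\,dx\,dy \\
& + \int_{\mathbb{R}^2} \sum_{k,m,s,t} \beta^{3,(s,t)}_{j-1,(k,m)}\,\Psi^{3,(s,t)}_{j-1,(k,m)}(x, y)\,\Phi^{(u,v)}_{j-1,(a,c)}(x, y)\,dx\,dy
\end{aligned}
$$

Using the expanded form of the functions from equations (3.158) through (3.161) and
rearranging the integrals, we have

$$
\begin{aligned}
\sum_{k,m,s,t} \alpha^{(s,t)}_{j,(k,m)} & \int_{\mathbb{R}} \phi^s_{j,k}(x)\,\phi^u_{j-1,a}(x)\,dx \int_{\mathbb{R}} \phi^t_{j,m}(y)\,\phi^v_{j-1,c}(y)\,dy \\
= {} & \sum_{k,m,s,t} \alpha^{(s,t)}_{j-1,(k,m)} \int_{\mathbb{R}} \phi^s_{j-1,k}(x)\,\phi^u_{j-1,a}(x)\,dx \int_{\mathbb{R}} \phi^t_{j-1,m}(y)\,\phi^v_{j-1,c}(y)\,dy \\
& + \sum_{k,m,s,t} \beta^{1,(s,t)}_{j-1,(k,m)} \int_{\mathbb{R}} \phi^s_{j-1,k}(x)\,\phi^u_{j-1,a}(x)\,dx \int_{\mathbb{R}} \psi^t_{j-1,m}(y)\,\phi^v_{j-1,c}(y)\,dy \\
& + \sum_{k,m,s,t} \beta^{2,(s,t)}_{j-1,(k,m)} \int_{\mathbb{R}} \psi^s_{j-1,k}(x)\,\phi^u_{j-1,a}(x)\,dx \int_{\mathbb{R}} \phi^t_{j-1,m}(y)\,\phi^v_{j-1,c}(y)\,dy \\
& + \sum_{k,m,s,t} \beta^{3,(s,t)}_{j-1,(k,m)} \int_{\mathbb{R}} \psi^s_{j-1,k}(x)\,\phi^u_{j-1,a}(x)\,dx \int_{\mathbb{R}} \psi^t_{j-1,m}(y)\,\phi^v_{j-1,c}(y)\,dy
\end{aligned}
\tag{3.174}
$$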

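Recall that multiwavelets satisfy the following orthogonality properties in one dimension:

$$
\begin{aligned}
(1)\quad & \int_{\mathbb{R}} \phi^s_{j,k}(x)\,\phi^t_{j,m}(x)\,dx = \delta_{(k,s),(m,t)} \\
(2)\quad & \int_{\mathbb{R}} \psi^s_{j,k}(x)\,\psi^t_{l,m}(x)\,dx = \delta_{(j,k,s),(l,m,t)} \\
(3)\quad & \int_{\mathbb{R}} \phi^s_{j,k}(x)\,\psi^t_{l,m}(x)\,dx = 0 \quad \forall\, k, m \in \mathbb{Z},\ s, t \in \{1, \ldots, r\},\ j \le l
\end{aligned}
\tag{3.175}
$$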

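Using these properties, equation (3.174) can be reduced to

$$\sum_{k,m,s,t} \alpha^{(s,t)}_{j,(k,m)} \int_{\mathbb{R}} \phi^s_{j,k}(x)\,\phi^u_{j-1,a}(x)\,dx \int_{\mathbb{R}} \phi^t_{j,m}(y)\,\phi^v_{j-1,c}(y)\,dy = \sum_{k,m,s,t} \alpha^{(s,t)}_{j-1,(k,m)}\,\delta_{(k,s),(a,u)}\,\delta_{(m,t),(c,v)}$$

and

$$\alpha^{(u,v)}_{j-1,(a,c)} = \sum_{k,m,s,t} \alpha^{(s,t)}_{j,(k,m)} \int_{\mathbb{R}} \phi^s_{j,k}(x)\,\phi^u_{j-1,a}(x)\,dx \int_{\mathbb{R}} \phi^t_{j,m}(y)\,\phi^v_{j-1,c}(y)\,dy \tag{3.176}$$

Recall that we have the following two-scale equations for multiwavelets:

$$\phi^u_{j-1,a}(x) = \sum_{p,q} a^{u,q}_p\,\phi^q_{j,2a+p}(x) \tag{3.177}$$

$$\psi^u_{j-1,a}(x) = \sum_{p,q} b^{u,q}_p\,\phi^q_{j,2a+p}(x) \tag{3.178}$$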
P,Q 

Then, using equation (3.177) to substitute for $\phi^u_{j-1,a}$ and $\phi^v_{j-1,c}$, equation (3.176)
becomes

$$\alpha^{(u,v)}_{j-1,(a,c)} = \sum_{k,m,s,t} \sum_{p,q,f,e} \alpha^{(s,t)}_{j,(k,m)}\,a^{u,q}_p\,a^{v,e}_f \int_{\mathbb{R}} \phi^s_{j,k}(x)\,\phi^q_{j,2a+p}(x)\,dx \int_{\mathbb{R}} \phi^t_{j,m}(y)\,\phi^e_{j,2c+f}(y)\,dy$$

Using the orthogonality properties once again, this can be written as

$$\alpha^{(u,v)}_{j-1,(a,c)} = \sum_{k,m,s,t} \sum_{p,q,f,e} \alpha^{(s,t)}_{j,(k,m)}\,a^{u,q}_p\,a^{v,e}_f\,\delta_{(2a+p,q),(k,s)}\,\delta_{(2c+f,e),(m,t)}$$

Finally, we obtain the following decomposition formula for the scaling function coefficients on level $j-1$:

$$\alpha^{(u,v)}_{j-1,(a,c)} = \sum_{k,m,s,t} a^{u,s}_{k-2a}\,a^{v,t}_{m-2c}\,\alpha^{(s,t)}_{j,(k,m)} \tag{3.179}$$
In a similar manner, the other two-dimensional decomposition formulas can be derived:

$$\beta^{1,(u,v)}_{j-1,(a,c)} = \sum_{k,m,s,t} a^{u,s}_{k-2a}\,b^{v,t}_{m-2c}\,\alpha^{(s,t)}_{j,(k,m)} \tag{3.180}$$

$$\beta^{2,(u,v)}_{j-1,(a,c)} = \sum_{k,m,s,t} b^{u,s}_{k-2a}\,a^{v,t}_{m-2c}\,\alpha^{(s,t)}_{j,(k,m)} \tag{3.181}$$

$$\beta^{3,(u,v)}_{j-1,(a,c)} = \sum_{k,m,s,t} b^{u,s}_{k-2a}\,b^{v,t}_{m-2c}\,\alpha^{(s,t)}_{j,(k,m)} \tag{3.182}$$

These formulas are obtained by multiplying equations (3.171) and (3.172) by $\Psi^{1,(u,v)}_{j-1,(a,c)}$,
$\Psi^{2,(u,v)}_{j-1,(a,c)}$, and $\Psi^{3,(u,v)}_{j-1,(a,c)}$, respectively, and integrating over $\mathbb{R}^2$. Once again, the orthogonality properties of the multiwavelets and the two-scale equations are used to
simplify the integrations. The above decomposition formulas can be applied recursively to calculate the coefficients that appear in the multiscale expansion of $f_j$ in
equation (3.173).
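Because the two-dimensional formulas (3.179) through (3.182) are separable in the two index pairs, each can be evaluated as a pair of block-matrix products. The sketch below compares a direct evaluation of the sums with that block-matrix form. The matrix filters are random, purely hypothetical stand-ins (no orthonormality is needed for this comparison), the indices are wrapped periodically, and the array layout `alpha[k, m, s, t]` is an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
r, n, n_taps = 2, 4, 2
a = rng.standard_normal((n_taps, r, r))    # hypothetical [a_p]
b = rng.standard_normal((n_taps, r, r))    # hypothetical [b_p]
# alpha[k, m, s, t] stores alpha^{(s,t)}_{j,(k,m)}.
alpha = rng.standard_normal((n, n, r, r))

def decompose_direct(alpha, fx, fy):
    """out[a,c,u,v] = sum_{k,m,s,t} fx^{u,s}_{k-2a} fy^{v,t}_{m-2c} alpha^{(s,t)}_{j,(k,m)}."""
    n, _, r, _ = alpha.shape
    half = n // 2
    out = np.zeros((half, half, r, r))
    for ai in range(half):
        for ci in range(half):
            for p in range(fx.shape[0]):
                for f in range(fy.shape[0]):
                    k = (2 * ai + p) % n
                    m = (2 * ci + f) % n
                    # Sums over s and t are carried out by the matrix products.
                    out[ai, ci] += fx[p] @ alpha[k, m] @ fy[f].T
    return out

def block_filter_matrix(filt, n):
    """Block matrix whose (m, k) block is [f_{k-2m}], with periodic wrapping."""
    n_taps, r, _ = filt.shape
    M = np.zeros((r * (n // 2), r * n))
    for m in range(n // 2):
        for p in range(n_taps):
            k = (2 * m + p) % n
            M[m * r:(m + 1) * r, k * r:(k + 1) * r] += filt[p]
    return M

def decompose_block(alpha, fx, fy):
    """Same result, computed as Mx @ X @ My^T with X indexed by ((k,s),(m,t))."""
    n, _, r, _ = alpha.shape
    X = alpha.transpose(0, 2, 1, 3).reshape(n * r, n * r)
    Y = block_filter_matrix(fx, n) @ X @ block_filter_matrix(fy, n).T
    half = n // 2
    return Y.reshape(half, r, half, r).transpose(0, 2, 1, 3)

alpha_c = decompose_direct(alpha, a, a)    # equation (3.179)
beta1 = decompose_direct(alpha, a, b)      # equation (3.180)
beta2 = decompose_direct(alpha, b, a)      # equation (3.181)
beta3 = decompose_direct(alpha, b, b)      # equation (3.182)
```

The only difference among the four formulas is which filter set acts in each direction, which is why a single routine with two filter arguments covers all of them.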

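A reconstruction formula for the single-scale coefficients $\alpha^{(u,v)}_{j,(a,c)}$ can be derived by multiplying the single-scale and two-scale expansions of $f_j$ by $\Phi^{(u,v)}_{j,(a,c)}$ and
integrating over $\mathbb{R}^2$:

$$
\begin{aligned}
\sum_{k,m,s,t} \alpha^{(s,t)}_{j,(k,m)} & \int_{\mathbb{R}} \phi^s_{j,k}(x)\,\phi^u_{j,a}(x)\,dx \int_{\mathbb{R}} \phi^t_{j,m}(y)\,\phi^v_{j,c}(y)\,dy \\
= {} & \sum_{k,m,s,t} \alpha^{(s,t)}_{j-1,(k,m)} \int_{\mathbb{R}} \phi^s_{j-1,k}(x)\,\phi^u_{j,a}(x)\,dx \int_{\mathbb{R}} \phi^t_{j-1,m}(y)\,\phi^v_{j,c}(y)\,dy \\
& + \sum_{k,m,s,t} \beta^{1,(s,t)}_{j-1,(k,m)} \int_{\mathbb{R}} \phi^s_{j-1,k}(x)\,\phi^u_{j,a}(x)\,dx \int_{\mathbb{R}} \psi^t_{j-1,m}(y)\,\phi^v_{j,c}(y)\,dy \\
& + \sum_{k,m,s,t} \beta^{2,(s,t)}_{j-1,(k,m)} \int_{\mathbb{R}} \psi^s_{j-1,k}(x)\,\phi^u_{j,a}(x)\,dx \int_{\mathbb{R}} \phi^t_{j-1,m}(y)\,\phi^v_{j,c}(y)\,dy \\
& + \sum_{k,m,s,t} \beta^{3,(s,t)}_{j-1,(k,m)} \int_{\mathbb{R}} \psi^s_{j-1,k}(x)\,\phi^u_{j,a}(x)\,dx \int_{\mathbb{R}} \psi^t_{j-1,m}(y)\,\phi^v_{j,c}(y)\,dy
\end{aligned}
\tag{3.183}
$$

Note that the expanded form of the scaling functions and wavelets has been used
in equation (3.183). By the orthogonality properties of the multiwavelets, equation
(3.183) can be simplified to

$$
\begin{aligned}
\alpha^{(u,v)}_{j,(a,c)} = {} & \sum_{k,m,s,t} \alpha^{(s,t)}_{j-1,(k,m)} \int_{\mathbb{R}} \phi^s_{j-1,k}(x)\,\phi^u_{j,a}(x)\,dx \int_{\mathbb{R}} \phi^t_{j-1,m}(y)\,\phi^v_{j,c}(y)\,dy \\
& + \sum_{k,m,s,t} \beta^{1,(s,t)}_{j-1,(k,m)} \int_{\mathbb{R}} \phi^s_{j-1,k}(x)\,\phi^u_{j,a}(x)\,dx \int_{\mathbb{R}} \psi^t_{j-1,m}(y)\,\phi^v_{j,c}(y)\,dy \\
& + \sum_{k,m,s,t} \beta^{2,(s,t)}_{j-1,(k,m)} \int_{\mathbb{R}} \psi^s_{j-1,k}(x)\,\phi^u_{j,a}(x)\,dx \int_{\mathbb{R}} \phi^t_{j-1,m}(y)\,\phi^v_{j,c}(y)\,dy \\
& + \sum_{k,m,s,t} \beta^{3,(s,t)}_{j-1,(k,m)} \int_{\mathbb{R}} \psi^s_{j-1,k}(x)\,\phi^u_{j,a}(x)\,dx \int_{\mathbb{R}} \psi^t_{j-1,m}(y)\,\phi^v_{j,c}(y)\,dy
\end{aligned}
$$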

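Using the two-scale relations, we have

$$
\begin{aligned}
\alpha^{(u,v)}_{j,(a,c)} = {} & \sum_{k,m,s,t} \sum_{p,q,f,e} a^{s,q}_p\,a^{t,e}_f\,\alpha^{(s,t)}_{j-1,(k,m)} \int_{\mathbb{R}} \phi^q_{j,2k+p}(x)\,\phi^u_{j,a}(x)\,dx \int_{\mathbb{R}} \phi^e_{j,2m+f}(y)\,\phi^v_{j,c}(y)\,dy \\
& + \sum_{k,m,s,t} \sum_{p,q,f,e} a^{s,q}_p\,b^{t,e}_f\,\beta^{1,(s,t)}_{j-1,(k,m)} \int_{\mathbb{R}} \phi^q_{j,2k+p}(x)\,\phi^u_{j,a}(x)\,dx \int_{\mathbb{R}} \phi^e_{j,2m+f}(y)\,\phi^v_{j,c}(y)\,dy \\
& + \sum_{k,m,s,t} \sum_{p,q,f,e} b^{s,q}_p\,a^{t,e}_f\,\beta^{2,(s,t)}_{j-1,(k,m)} \int_{\mathbb{R}} \phi^q_{j,2k+p}(x)\,\phi^u_{j,a}(x)\,dx \int_{\mathbb{R}} \phi^e_{j,2m+f}(y)\,\phi^v_{j,c}(y)\,dy \\
& + \sum_{k,m,s,t} \sum_{p,q,f,e} b^{s,q}_p\,b^{t,e}_f\,\beta^{3,(s,t)}_{j-1,(k,m)} \int_{\mathbb{R}} \phi^q_{j,2k+p}(x)\,\phi^u_{j,a}(x)\,dx \int_{\mathbb{R}} \phi^e_{j,2m+f}(y)\,\phi^v_{j,c}(y)\,dy
\end{aligned}
$$

Once again making use of the orthogonality properties, this reduces to

$$
\begin{aligned}
\alpha^{(u,v)}_{j,(a,c)} = {} & \sum_{k,m,s,t} \sum_{p,q,f,e} a^{s,q}_p\,a^{t,e}_f\,\alpha^{(s,t)}_{j-1,(k,m)}\,\delta_{(2k+p,q),(a,u)}\,\delta_{(2m+f,e),(c,v)} \\
& + \sum_{k,m,s,t} \sum_{p,q,f,e} a^{s,q}_p\,b^{t,e}_f\,\beta^{1,(s,t)}_{j-1,(k,m)}\,\delta_{(2k+p,q),(a,u)}\,\delta_{(2m+f,e),(c,v)} \\
& + \sum_{k,m,s,t} \sum_{p,q,f,e} b^{s,q}_p\,a^{t,e}_f\,\beta^{2,(s,t)}_{j-1,(k,m)}\,\delta_{(2k+p,q),(a,u)}\,\delta_{(2m+f,e),(c,v)} \\
& + \sum_{k,m,s,t} \sum_{p,q,f,e} b^{s,q}_p\,b^{t,e}_f\,\beta^{3,(s,t)}_{j-1,(k,m)}\,\delta_{(2k+p,q),(a,u)}\,\delta_{(2m+f,e),(c,v)}
\end{aligned}
$$

Finally, the following reconstruction formula is obtained:

$$
\begin{aligned}
\alpha^{(u,v)}_{j,(a,c)} = \sum_{k,m,s,t} \Big\{ & a^{s,u}_{a-2k}\,a^{t,v}_{c-2m}\,\alpha^{(s,t)}_{j-1,(k,m)} + a^{s,u}_{a-2k}\,b^{t,v}_{c-2m}\,\beta^{1,(s,t)}_{j-1,(k,m)} \\
& + b^{s,u}_{a-2k}\,a^{t,v}_{c-2m}\,\beta^{2,(s,t)}_{j-1,(k,m)} + b^{s,u}_{a-2k}\,b^{t,v}_{c-2m}\,\beta^{3,(s,t)}_{j-1,(k,m)} \Big\}
\end{aligned}
\tag{3.184}
$$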


Single-generator wavelet families, which are comprised of one scaling function $\phi$
and one associated wavelet $\psi$, clearly form a subset of multiwavelets. In this case, we
have one two-dimensional tensor product scaling function and three wavelets:

$$
\begin{aligned}
\Phi(x, y) &:= \phi(x)\,\phi(y) \\
\Psi^1(x, y) &:= \phi(x)\,\psi(y) \\
\Psi^2(x, y) &:= \psi(x)\,\phi(y) \\
\Psi^3(x, y) &:= \psi(x)\,\psi(y)
\end{aligned}
\tag{3.185}
$$

A two-scale expansion of a two-dimensional function $f$ can be written as

$$
\begin{aligned}
f_j(x, y) = {} & \sum_{k,m} \alpha_{j-1,(k,m)}\,\Phi_{j-1,(k,m)}(x, y) + \sum_{k,m} \beta^1_{j-1,(k,m)}\,\Psi^1_{j-1,(k,m)}(x, y) \\
& + \sum_{k,m} \beta^2_{j-1,(k,m)}\,\Psi^2_{j-1,(k,m)}(x, y) + \sum_{k,m} \beta^3_{j-1,(k,m)}\,\Psi^3_{j-1,(k,m)}(x, y)
\end{aligned}
\tag{3.186}
$$

and an equivalent, multiscale representation is given by

$$
\begin{aligned}
f_j(x, y) = {} & \sum_{k,m} \alpha_{j_0,(k,m)}\,\Phi_{j_0,(k,m)}(x, y) + \sum_{l=j_0}^{j-1} \sum_{k,m} \beta^1_{l,(k,m)}\,\Psi^1_{l,(k,m)}(x, y) \\
& + \sum_{l=j_0}^{j-1} \sum_{k,m} \beta^2_{l,(k,m)}\,\Psi^2_{l,(k,m)}(x, y) + \sum_{l=j_0}^{j-1} \sum_{k,m} \beta^3_{l,(k,m)}\,\Psi^3_{l,(k,m)}(x, y)
\end{aligned}
\tag{3.187}
$$

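In this case, the decomposition formulas take the form:

$$\alpha_{j-1,(a,c)} = \sum_{k,m} a_{k-2a}\, a_{m-2c}\, \alpha_{j,(k,m)} \tag{3.188}$$

$$\beta^1_{j-1,(a,c)} = \sum_{k,m} a_{k-2a}\, b_{m-2c}\, \alpha_{j,(k,m)} \tag{3.189}$$

$$\beta^2_{j-1,(a,c)} = \sum_{k,m} b_{k-2a}\, a_{m-2c}\, \alpha_{j,(k,m)} \tag{3.190}$$

$$\beta^3_{j-1,(a,c)} = \sum_{k,m} b_{k-2a}\, b_{m-2c}\, \alpha_{j,(k,m)} \tag{3.191}$$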

and the reconstruction formula can be written as

$$\alpha_{j,(a,c)} = \sum_{k,m} \left\{ a_{a-2k}\, a_{c-2m}\, \alpha_{j-1,(k,m)} + a_{a-2k}\, b_{c-2m}\, \beta^1_{j-1,(k,m)} + b_{a-2k}\, a_{c-2m}\, \beta^2_{j-1,(k,m)} + b_{a-2k}\, b_{c-2m}\, \beta^3_{j-1,(k,m)} \right\} \tag{3.192}$$

Recall that, while in the general multiwavelet case the scaling function and wavelet filters are matrices, in the single-generator case they reduce to scalars.
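As an illustration of the single-generator formulas (3.188) through (3.192), the following minimal sketch applies one decomposition level and the matching reconstruction to a small coefficient array. It assumes the Haar filters $a_0 = a_1 = 1/\sqrt{2}$, $b_0 = -b_1 = 1/\sqrt{2}$ and a square array of even size; the function names `decompose` and `reconstruct` are hypothetical and are not taken from the dissertation.

```python
import math

S = 1.0 / math.sqrt(2.0)
A = (S, S)    # Haar scaling filter a_0, a_1 (assumed example filter)
B = (S, -S)   # Haar wavelet filter b_0, b_1 (assumed example filter)


def decompose(alpha):
    """One level of (3.188)-(3.191): alpha_j -> (alpha, beta1, beta2, beta3) at level j-1."""
    n = len(alpha) // 2

    def apply(fx, fy):
        # sum over k = 2a + p and m = 2c + q of fx[k - 2a] * fy[m - 2c] * alpha[k][m]
        return [[sum(fx[p] * fy[q] * alpha[2 * a + p][2 * c + q]
                     for p in (0, 1) for q in (0, 1))
                 for c in range(n)] for a in range(n)]

    return apply(A, A), apply(A, B), apply(B, A), apply(B, B)


def reconstruct(alpha, beta1, beta2, beta3):
    """Reconstruction formula (3.192): level j-1 coefficients -> alpha_j."""
    n = len(alpha)
    out = [[0.0] * (2 * n) for _ in range(2 * n)]
    for a in range(2 * n):
        for c in range(2 * n):
            k, p = divmod(a, 2)   # the Haar filters are nonzero only for a - 2k in {0, 1}
            m, q = divmod(c, 2)
            out[a][c] = (A[p] * A[q] * alpha[k][m] + A[p] * B[q] * beta1[k][m]
                         + B[p] * A[q] * beta2[k][m] + B[p] * B[q] * beta3[k][m])
    return out


fine = [[float(4 * i + j + 1) for j in range(4)] for i in range(4)]
coarse, d1, d2, d3 = decompose(fine)
restored = reconstruct(coarse, d1, d2, d3)
```

Because the Haar filters are orthonormal, `restored` agrees with `fine` up to round-off, and a constant input produces zero detail coefficients.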

As a final note, it is not difficult to extend the results given in this section to higher-dimensional tensor product multiwavelets. The notation necessarily becomes more cumbersome, but the concept remains the same. In general, for a multiwavelet family generated by $r$ scaling functions, there are a total of $r^n$ $n$-dimensional tensor product scaling functions and $(2^n - 1)\,r^n$ associated wavelets, which reduces to $3r^2$ wavelets in the two-dimensional case.


CHAPTER  4 

MULTIWAVELET  CONSTRUCTIONS 


In this chapter, piecewise-polynomial multiwavelets are constructed from classical finite elements using the technique of intertwining multiresolution analyses. These multiwavelets are orthonormal, compactly supported, and possess symmetry or antisymmetry. Furthermore, they can be constructed to have arbitrary approximation order and are easily adapted to boundaries. In two dimensions, the resulting tensor product multiwavelets are easily adapted to the square domain over which the symmetric form of the second-order Volterra kernel is supported. In Section 4.1, the technique of intertwining multiresolution analyses is introduced. It is shown that one intertwining is not sufficient to generate multiwavelets with desirable approximation properties from lower-order finite elements. In Section 4.2, it is demonstrated that, as noted by Donovan, Geronimo, and Hardin [2], by performing two successive intertwinings of multiresolution analyses, multiwavelets that are suitable for approximation can be derived from these finite elements. This technique is demonstrated in the construction of piecewise-quadratic and piecewise-cubic multiwavelets in Sections 4.3 and 4.4, respectively. In Section 4.5, a theorem is given that generalizes these results for the construction of piecewise-polynomial wavelets of arbitrary approximation order. In order to use these piecewise-polynomial multiwavelets to represent first and second-order Volterra kernels, it is necessary to adapt the functions to the domains over which the kernels are supported. Therefore, boundary-adapted multiwavelets are discussed in Section 4.6. Finally, in Section 4.7, an example is given in which the boundary-adapted multiwavelets are used to decompose a one-dimensional function.
4.1  Multiwavelets  and  Multiresolution

Recall, from Chapter 3, that a multiresolution analysis of multiplicity $r$ is defined as a family of subspaces $\{V_j\}_{j\in\mathbb{Z}}$ of $L^2(\mathbb{R})$ satisfying the following conditions (4.1):

(1) $\cdots \subset V_{-2} \subset V_{-1} \subset V_0 \subset V_1 \subset V_2 \subset \cdots$

(2) $\bigcap_{j\in\mathbb{Z}} V_j = \{0\}$

(3) $\overline{\bigcup_{j\in\mathbb{Z}} V_j} = L^2(\mathbb{R})$

(4) $f(x) \in V_j \iff f(2x) \in V_{j+1}$

(5) $f(x) \in V_j \implies f(x-k) \in V_j \quad \forall\, k \in \mathbb{Z}$

(6) There are $r$ functions $\{\phi^1, \ldots, \phi^r\}$ such that the set of translates $\{\phi^s(x-k) : s = 1, \ldots, r\}_{k\in\mathbb{Z}}$ forms a Riesz basis for $V_0$


The functions $\{\phi^1, \ldots, \phi^r\}$ are the scaling functions, or generators, of the multiresolution analysis. Each of the spaces $\{V_j\}$ above is an example of a finitely-generated shift-invariant (FSI) space. As noted in Ref. [2], there is clearly some flexibility in the selection of the space $V_0$ for any given multiresolution analysis. In this development, the generators of $V_0$ are chosen to have support in $[-1,1]$. This simply amounts to shifting the multiresolution sequence $\{V_j\}$ so that we consider $\{\hat{V}_j\}_{j\in\mathbb{Z}}$, where $\hat{V}_j = V_{j+m}$ for some fixed $m \in \mathbb{Z}$. In addition, it will be required that the families of generators $\{\phi^1, \ldots, \phi^r\}$ are minimally supported in $[-1,1]$.

Definition 4.1 [2]: Suppose that $\{\phi^1, \ldots, \phi^r\}$ are generators of a multiresolution analysis, where $\{\phi^1, \ldots, \phi^k\}$ are supported on $[-1,1]$ and the remaining $\{\phi^{k+1}, \ldots, \phi^r\}$ are supported on $[0,1]$. The generators $\{\phi^1, \ldots, \phi^r\}$ are minimally supported on $[-1,1]$ provided that
(1)

, . , are  linearly  independent  on  [0, 1] 

(2) $\{\phi^1, \ldots, \phi^k\}$ are linearly independent on $[-1, 0]$

(3) $\{0\} = \operatorname{span}\{\phi^1(x)\chi_{[0,1]}(x), \ldots, \phi^k(x)\chi_{[0,1]}(x)\} \cap \operatorname{span}\{\phi^1(x-1)\chi_{[0,1]}(x), \ldots, \phi^k(x-1)\chi_{[0,1]}(x)\}$

Note that $\chi_{[0,1]}$ denotes the characteristic function over $[0,1]$, defined as

$$\chi_{[0,1]}(x) = \begin{cases} 1 & x \in [0,1] \\ 0 & \text{otherwise} \end{cases} \tag{4.2}$$

The advantage of working with minimally-supported families of generators is that the restrictions of shifts of the generators $\{\phi^1, \ldots, \phi^r\}$ to any interval are linearly independent. That is,


are  all  linearly  independent  on  [0, 1]. 

The fundamental tool that will be used in the construction of orthogonal piecewise-polynomial multiwavelets is the technique of intertwining multiresolution analyses. Suppose that $\{\phi^1, \ldots, \phi^r\}$ are generators of a multiresolution analysis that is minimally supported on $[-1,1]$. We define the spaces

$$A(V_0) := \operatorname{span}\{\phi^s : s = k+1, \ldots, r\} \tag{4.3}$$

$$B_\sigma(V_0) := \operatorname{span}\left(\{\phi^s(\cdot - \sigma)\chi_{[0,1]} : s = 1, \ldots, k\} \cup A(V_0)\right), \quad \sigma = 0, 1 \tag{4.4}$$

$$C_\sigma(V_0) := B_\sigma(V_0) \ominus A(V_0), \quad \sigma = 0, 1 \tag{4.5}$$
The process by which intertwining multiresolution analyses are constructed is defined in terms of the subspaces $A(V_0)$, $B_\sigma(V_0)$, and $C_\sigma(V_0)$ defined above and the following lemma from Ref. [2]:

Lemma 4.1 (Donovan, Geronimo, and Hardin [2]): Let $\{V_j\}_{j\in\mathbb{Z}}$ be a multiresolution analysis generated by $r$ scaling functions $\{\phi^1, \ldots, \phi^r\}$ that are minimally supported on $[-1,1]$. Suppose that there is a subspace $W \subset A(V_1) \ominus A(V_0)$ such that

$$(I - P_W)\, C_0(V_0) \perp (I - P_W)\, C_1(V_0) \tag{4.6}$$

where $P_W$ is the orthogonal projection onto $W$. Suppose that $\{w_1, \ldots, w_k\}$ is a basis for $W$. Then, the family of functions $\{\phi^1, \ldots, \phi^r, w_1, \ldots, w_k\}$ generates an orthogonal multiresolution analysis $\{\tilde{V}_j\}$ that intertwines $\{V_j\}$ in the sense that

$$\cdots \subset V_0 \subset \tilde{V}_0 \subset V_1 \subset \cdots$$

The method by which this lemma can be utilized to generate orthogonal multiwavelets is simple, at least in principle. We begin by considering the multiresolution analysis $\{V_j\}$ that is generated by classical $C^0$ finite elements supported on $[-1,1]$. We then seek a basis $\{w_1, \ldots, w_k\}$ for a subspace of $A(V_1) \ominus A(V_0)$ that satisfies equation (4.6). If such a basis can be found, the desired orthogonal multiresolution analysis has been obtained. This construction is depicted schematically in Figure (4.1). Unfortunately, in many cases of interest, the subspace $A(V_1) \ominus A(V_0)$ is not large enough to design multiwavelets that exhibit desirable properties such as symmetry or antisymmetry. For example, the construction carried out in Ref. [2] yields smooth piecewise-polynomial multiwavelets that are of cubic order or higher. The lack of lower-order orthogonal piecewise-polynomial multiwavelets can be attributed precisely to the small dimension of $A(V_1) \ominus A(V_0)$.


Figure  4.1:  Intertwining  Multiresolution  Analyses 

To explain the role of the dimension of $A(V_1) \ominus A(V_0)$ in the construction of multiwavelets, consider the two examples depicted in Figures (4.2) and (4.3). Figure (4.2) depicts the spaces $A(V_0)$, $A(V_1)$, and $A(V_2)$ for the case in which we seek to construct intertwining multiresolution analyses from piecewise-linear finite elements. Figure (4.3) depicts the corresponding spaces for classical quadratic finite elements. In either case, the dimension of the space $A(V_1) \ominus A(V_0)$ is 2. If we seek a subspace $W \subset A(V_1) \ominus A(V_0)$ that satisfies equation (4.6), the dimension of $W$ is 1. In other words, the subspace $W$ is completely determined by the choice of the original nested spaces of piecewise polynomials in Figures (4.2) and (4.3), and the orthogonality condition in equation (4.6). As a result, there is no freedom to design the resulting basis for $W$ to be reasonable for approximation purposes. In fact, Example (3.1) in Ref. [2] carries out this analysis starting with piecewise-linear finite elements. The scaling functions and wavelets associated with the resulting orthogonal multiresolution analysis are neither symmetric nor antisymmetric. Moreover, they do not exhibit appropriate interpolatory properties at the element boundaries and are difficult to employ for general classes of boundary conditions. They simply are not attractive candidates for approximation.

4.2  Construction  of  Intertwining  Multiresolution  Analyses 
From the preceding discussion, it is apparent that the construction of intertwining multiresolution analyses from lower-order finite elements does not yield functions with good approximation properties. As noted in Refs. [2] and [3], this problem can be addressed by creating two successive intertwinings of multiresolution analyses. The second intertwining effectively gives another degree of freedom to the design process. In this section, this procedure is outlined for the goal of constructing piecewise-polynomial multiwavelets from the classical finite element spaces.

Figure 4.2: Generators Supported on [0,1]: Linear FEM

Figure 4.3: Generators Supported on [0,1]: Quadratic FEM

The starting point for the procedure is the finite shift-invariant (FSI) spaces $\{V_j\}_{j\in\mathbb{Z}}$ associated with the classical finite element spaces of a particular order. Given $r$ generators $\{\phi^1, \ldots, \phi^r\}$, the FSI spaces are defined as

$$V_j := \operatorname{span}\{\phi^s_{j,k} : s = 1, \ldots, r\}_{k\in\mathbb{Z}} \tag{4.7}$$

where we have the usual notation

$$\phi^s_{j,k}(x) = 2^{j/2}\,\phi^s(2^j x - k) \tag{4.8}$$
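The dilation and translation convention in equation (4.8) can be checked numerically. The sketch below is only illustrative: it assumes a piecewise-linear hat function as the generator and a simple midpoint quadrature, and the helper names `dilate_translate` and `l2_norm_sq` are hypothetical. It shows that the factor $2^{j/2}$ keeps the $L^2$ norm unchanged for any $j$ and $k$.

```python
def hat(x):
    """Piecewise-linear hat function supported on [-1, 1] (assumed example generator)."""
    return max(0.0, 1.0 - abs(x))


def dilate_translate(phi, j, k):
    """Return phi_{j,k}(x) = 2^(j/2) * phi(2^j x - k), following equation (4.8)."""
    scale = 2.0 ** (j / 2.0)
    return lambda x: scale * phi((2.0 ** j) * x - k)


def l2_norm_sq(f, a=-4.0, b=4.0, n=16384):
    """Midpoint-rule approximation of the squared L2 norm of f on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) ** 2 for i in range(n))


base = l2_norm_sq(hat)                                  # exact value is 2/3
fine = l2_norm_sq(dilate_translate(hat, 2, 3))          # supported on [0.5, 1]
coarse = l2_norm_sq(dilate_translate(hat, -1, 0))       # supported on [-2, 2]
```

All three values agree with $2/3$ up to quadrature error, which is the norm-preservation property that the normalization in (4.8) is designed to provide.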

The first step in the intertwining process is to select a function $w \in V_1$. Then, a new set of generators is defined as

$$\{\tilde{\phi}^1, \ldots, \tilde{\phi}^{r+1}\} := \{\phi^1, \ldots, \phi^r, w\} \tag{4.9}$$

for a new multiresolution analysis $\{\tilde{V}_j\}_{j\in\mathbb{Z}}$. This multiresolution analysis satisfies

$$V_0 \subset \tilde{V}_0 \subset V_1 \tag{4.10}$$

By the definition of a finite shift-invariant space, we also have

$$V_0 \subset \tilde{V}_0 \subset V_1 \subset \tilde{V}_1 \subset \cdots \tag{4.11}$$

and $\{\tilde{V}_j\}$ is a multiresolution analysis that intertwines $\{V_j\}$. Note that $\{\tilde{V}_j\}$ is not constructed such that it is an orthogonal multiresolution analysis. If that were the case, the dimensionality of $A(V_1) \ominus A(V_0)$ would severely restrict the design of $w$, and hence of the multiresolution analysis $\{\tilde{V}_j\}$.
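After the construction of $\{\tilde{V}_j\}$, a second intertwining is performed. That is, another function $\tilde{w} \in \tilde{V}_1$ is chosen and a new set of generators is defined:

$$\{\tilde{\tilde{\phi}}^1, \ldots, \tilde{\tilde{\phi}}^{r+2}\} := \{\tilde{\phi}^1, \ldots, \tilde{\phi}^{r+1}, \tilde{w}\} \tag{4.12}$$

In this case, $\tilde{w}$ is chosen to satisfy the conditions of Lemma (4.1). In other words, we seek $\tilde{w} \in A(\tilde{V}_1) \ominus A(\tilde{V}_0)$ such that

$$(I - P_{\tilde{w}})\, C_0(\tilde{V}_0) \perp (I - P_{\tilde{w}})\, C_1(\tilde{V}_0) \tag{4.13}$$

With the second intertwining, we have

$$\tilde{V}_0 \subset \tilde{\tilde{V}}_0 \subset \tilde{V}_1 \tag{4.14}$$

and $\{\tilde{\tilde{V}}_j\}$ is an orthogonal multiresolution analysis. In terms of the original multiresolution $\{V_j\}$, we have

$$\begin{aligned}
V_0 &\subset \tilde{V}_0 \subset \tilde{\tilde{V}}_0 \\
V_1 &\subset \tilde{V}_1 \subset \tilde{\tilde{V}}_1 \\
V_2 &\subset \tilde{V}_2 \subset \tilde{\tilde{V}}_2
\end{aligned} \tag{4.15}$$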

and 

hoC^C  (4.16) 

Therefore, by performing two intertwinings of multiresolution analyses, an orthogonal multiresolution analysis can be obtained. Donovan, Geronimo, and Hardin [2] have demonstrated this procedure in deriving piecewise-linear multiwavelets from the classical linear finite element spaces. The derived piecewise-linear scaling functions and wavelets are depicted in Figures (4.4) and (4.5). The corresponding scaling function and wavelet filter matrices are listed in Tables (4.1) and (4.2), respectively. In the following sections, the technique of successive intertwinings of multiresolution analyses is demonstrated for the construction of piecewise-quadratic and piecewise-cubic multiwavelets.

Figure 4.4: Piecewise-Linear Scaling Functions [2] (panels: scaling functions $\phi^1$, $\phi^2$, $\phi^3$, $\phi^4$)


4.3  Multiwavelets  From  Second-Order  Finite  Elements 
The principles discussed in the last section can be used to construct orthogonal multiresolution analyses from classical quadratic finite element spaces. When viewed as a finite shift-invariant space, the classical quadratic finite element spaces can be defined in terms of the two generators $\{\phi^1, \phi^2\}$ depicted in Figure (4.6). Following

Figure 4.5: Piecewise-Linear Multiwavelets [2] (panels shown: wavelets $\psi^3$, $\psi^4$)

Figure 4.6: Quadratic Piecewise-Polynomials: FSI Space and FEM Basis (panels: classical quadratic finite element; generators for the piecewise-quadratic FSI space)
Table 4.1: Piecewise-Linear Scaling Filter Matrices [2]


-2/2+/7  — 65+14/TT~ 

24/6  168/6 

0 0 


[a- 2 = 


0 

0 

0 

0 


1 


14/6 

0 


0 


0 


0 0 

-2/2+/T  79+14/14 


0 

0 


[«-i]  = 


14V2 

0 


24/6 

0 


168/6 

0 


13 

14/6 

0 


0 

0 

0 

0 

0 

0 

0 

0 

1 

-2/2+/7 

79+14/n 

13 

/2 

24/6 

168/6 

14/6 

0 

3 

3 

1 

4/2 

4/14 

/l4 

0 

vT 

9 

3 

4 

28/2 

7/2 

0 

/2+4/7 

-4+/I4 

0 

12/2 

12/2 

L 

4-/14 

65/2-28/7 

M = 


M = 


14\/2_ 
_3_ 
14 

Vi 

7 

0 


48/3 

3 

4/2 
7 


336/3 

3 

4/U 


12 


28/2 

-2/2+/7 


14/6 

1 

/l4 

3_ 

7/2 


_+2_ 


Table  4.2:  Piecewise-Linear  Multiwavelet  Filter  Matrices  [2] 


-2/2+/7 

6/6 

0 

-2/2+/F 

24/6 

-4+/I4 

24/6 


1+2/14 

6/6 

2/2 

7 

—65+14/14 

168/6 

-65/2+28/7 

168/6 


[6-2]  = 


0 

2/6 

7 

1 


14/2 

14 


1 

/6 

3 

7/2 

1 

14/6 

1 


14/3 


[6-l]  = 


0 

0 


0 

0 


1 — 2/2+/7 

’ 14/2  24/6 

_ J_  -4+/14 

14  24/6 


0 

0 

79+14/I4 

168/6 

79/2+28/7 


0 

0 

13 

14/6 

13 


0 0 0 

-2/2+/7 

1+2/14 

1 

6/6 

0 

-2/2+/P 

24/6 

-4+/l4 

24/6 

6/6 

2/2 

7 

79+14/14 

168/6 

79/2+28/7 

168/6 

/6 

3 

7/2 

13 

14/6 

13 

14/3 

0 

-2/2+/7 

6/6 

1+2/14 

6/6 

1 

/6 

2/6 

7 

0 

2/2 

7 

3 

7/2 

1 

-2/2+/7 

-65+14/14 

1 

14/2 

24/6 

168/6 

14/6 

1 

— 4+/l4 

-65/2+28/7 

1 

14 

24/6 

168/6 

14/3 

M = 


M = 
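the procedure outlined in the last section, we choose $\tilde{\phi}^3 \in V_1$ as depicted in Figure (4.7), where

$$\tilde{\phi}^3(x) = \phi^2(2x) - \phi^2(2x - 1) \tag{4.17}$$

By inspection, $\tilde{\phi}^3$ is orthogonal to $\phi^2$ and we simply choose $\tilde{\phi}^2 := \phi^2$. Both $\tilde{\phi}^2$ and $\tilde{\phi}^3$ are normalized to unit magnitude, and $\tilde{\phi}^1$ is constructed by applying the Gram-Schmidt orthogonalization procedure to $\phi^1$:

$$\begin{aligned}
\tilde{\phi}^1(x) = {} & \phi^1(x) - \langle \phi^1, \tilde{\phi}^2(\cdot + 1) \rangle\, \tilde{\phi}^2(x+1) - \langle \phi^1, \tilde{\phi}^3(\cdot + 1) \rangle\, \tilde{\phi}^3(x+1) \\
& - \langle \phi^1, \tilde{\phi}^2 \rangle\, \tilde{\phi}^2(x) - \langle \phi^1, \tilde{\phi}^3 \rangle\, \tilde{\phi}^3(x)
\end{aligned} \tag{4.18}$$

where $\langle \cdot, \cdot \rangle$ denotes the standard $L^2(\mathbb{R})$ inner product. In this fashion, a first intertwining multiresolution analysis is defined where $\tilde{V}_0$ is defined in terms of the generators $\{\tilde{\phi}^1, \tilde{\phi}^2, \tilde{\phi}^3\}$:

$$\tilde{V}_0 := \operatorname{span}\{\tilde{\phi}^s(\cdot - k) : s = 1, 2, 3\}_{k\in\mathbb{Z}} \tag{4.19}$$

and, by construction,

$$V_0 \subset \tilde{V}_0 \subset V_1$$

At this point, it is important to note that the spaces $\{\tilde{V}_j\}$ do not form an orthogonal multiresolution analysis. In fact, $\tilde{\phi}^1$ is not orthogonal to $\tilde{\phi}^1(\cdot - 1)$.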
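The first intertwining in equations (4.17) and (4.18) can be reproduced numerically. The following is a minimal sketch under stated assumptions: the classical quadratic generators are taken to be the nodal function $(1-|x|)(1-2|x|)$ on $[-1,1]$ and the bubble function $4x(1-x)$ on $[0,1]$ (the exact scaling used in Figure (4.6) is not recoverable from the text), inner products are approximated with a midpoint rule, and all function names are hypothetical.

```python
import math


def phi1(x):
    """Assumed nodal quadratic generator, supported on [-1, 1]."""
    ax = abs(x)
    return (1.0 - ax) * (1.0 - 2.0 * ax) if ax <= 1.0 else 0.0


def phi2(x):
    """Assumed quadratic bubble generator, supported on [0, 1]."""
    return 4.0 * x * (1.0 - x) if 0.0 <= x <= 1.0 else 0.0


def phi3_raw(x):
    """Equation (4.17): phi2(2x) - phi2(2x - 1), before normalization."""
    return phi2(2.0 * x) - phi2(2.0 * x - 1.0)


def inner(f, g, a=-2.0, b=2.0, n=8000):
    """Midpoint-rule approximation of the L2 inner product on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) * g(a + (i + 0.5) * h) for i in range(n))


def normalize(f):
    nrm = math.sqrt(inner(f, f))
    return lambda x: f(x) / nrm


def shift(f, k):
    """Return the translate f(. - k)."""
    return lambda x: f(x - k)


p2 = normalize(phi2)
p3 = normalize(phi3_raw)


def first_intertwining_phi1():
    """Equation (4.18): remove from phi1 its components along p2, p3 and their left shifts."""
    basis = [shift(p2, -1), shift(p3, -1), p2, p3]
    coeffs = [inner(phi1, e) for e in basis]
    return lambda x: phi1(x) - sum(c * e(x) for c, e in zip(coeffs, basis))


t1 = first_intertwining_phi1()
overlap = inner(t1, shift(t1, 1))   # inner product of t1 with its own translate
```

Under these assumptions the orthogonalized function is orthogonal to `p2` and `p3`, while `overlap` is nonzero (analytically $1/96$ for the generators assumed here), which is consistent with the remark that the first intertwining alone does not produce an orthogonal multiresolution analysis.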

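Proceeding to the second intertwining, a function $w$ is selected as

$$w(x) = \tilde{\phi}^1(2x - 1) + a\left[\tilde{\phi}^2(2x) + \tilde{\phi}^2(2x - 1)\right] + b\left[\tilde{\phi}^3(2x) - \tilde{\phi}^3(2x - 1)\right] \tag{4.20}$$

Figure 4.7: Generators for First Intertwining, Nonorthogonal Multiresolution Analysis (panel shown: scaling function $\tilde{\phi}^3$)

Recall that $A(\tilde{V}_1)$ is defined as the space spanned by all functions in $\tilde{V}_1$ whose support is contained in the interval $[0,1]$. Clearly, then, $w \in A(\tilde{V}_1)$. Moreover, $w$ is symmetric and automatically orthogonal to $\tilde{\phi}^3$. Note that $A(\tilde{V}_0) = \operatorname{span}\{\tilde{\phi}^2, \tilde{\phi}^3\}$. To apply Lemma (4.1), it is required that $w \in A(\tilde{V}_1) \ominus A(\tilde{V}_0)$. Therefore, one condition that must be satisfied by $w$ is

$$\langle w, \tilde{\phi}^2 \rangle = 0 \tag{4.21}$$

The function $w$ in equation (4.20) defines a one-dimensional space that is parameterized by two constants, $a$ and $b$. An additional condition is needed to determine $w$, and this is provided by Lemma (4.1):

$$(I - P_w)\, C_0(\tilde{V}_0) \perp (I - P_w)\, C_1(\tilde{V}_0) \tag{4.22}$$

For the problem at hand, this condition is simple to impose. By the definitions of the functions $\{\tilde{\phi}^1, \tilde{\phi}^2, \tilde{\phi}^3\}$ and the spaces $C_0$ and $C_1$ we have

$$c_0(x) := C_0(\tilde{V}_0) = \tilde{\phi}^1(x)\,\chi_{[0,1]}(x) \tag{4.23}$$

$$c_1(x) := C_1(\tilde{V}_0) = \tilde{\phi}^1(x - 1)\,\chi_{[0,1]}(x) \tag{4.24}$$

The orthogonality condition in equation (4.22) can then be written as

$$\langle (I - P_w)\, c_0, (I - P_w)\, c_1 \rangle = 0 \tag{4.25}$$

which can be expanded to

$$\langle c_0, c_1 \rangle + \langle P_w c_0, P_w c_1 \rangle - \langle c_0, P_w c_1 \rangle - \langle P_w c_0, c_1 \rangle = 0 \tag{4.26}$$

For any $f \in L^2(\mathbb{R})$, the orthogonal projection $P_w$ is given by

$$P_w f = \frac{\langle w, f \rangle}{\langle w, w \rangle}\, w \tag{4.27}$$

Substituting the expression for the projection into equation (4.26), we obtain

$$\langle c_0, c_1 \rangle + \left\langle \frac{\langle w, c_0 \rangle w}{\langle w, w \rangle}, \frac{\langle w, c_1 \rangle w}{\langle w, w \rangle} \right\rangle - \left\langle c_0, \frac{\langle w, c_1 \rangle w}{\langle w, w \rangle} \right\rangle - \left\langle \frac{\langle w, c_0 \rangle w}{\langle w, w \rangle}, c_1 \right\rangle = 0$$

which reduces to

$$\langle w, w \rangle \langle c_0, c_1 \rangle - \langle w, c_0 \rangle \langle w, c_1 \rangle = 0 \tag{4.28}$$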
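The reduction from equation (4.25) to equation (4.28) holds in any inner product space, so it can be sanity-checked with ordinary vectors. The sketch below is illustrative only: it uses short lists in place of the functions $w$, $c_0$, and $c_1$, and the helper names are hypothetical.

```python
def dot(u, v):
    """Euclidean inner product, standing in for the L2 inner product."""
    return sum(x * y for x, y in zip(u, v))


def project_out(w, f):
    """Return (I - P_w) f, where P_w f = (<w, f> / <w, w>) w."""
    s = dot(w, f) / dot(w, w)
    return [fi - s * wi for fi, wi in zip(f, w)]


def condition_4_25(w, c0, c1):
    """Left-hand side of (4.25): <(I - P_w) c0, (I - P_w) c1>."""
    return dot(project_out(w, c0), project_out(w, c1))


def condition_4_28(w, c0, c1):
    """Left-hand side of (4.28): <w, w><c0, c1> - <w, c0><w, c1>."""
    return dot(w, w) * dot(c0, c1) - dot(w, c0) * dot(w, c1)


w = [1.0, 2.0, 2.0]
c0 = [1.0, 0.0, 1.0]
c1 = [0.0, 1.0, 1.0]
lhs = condition_4_25(w, c0, c1)    # equals -1/3 for these vectors
rhs = condition_4_28(w, c0, c1)    # equals -3 for these vectors
```

The two quantities differ only by the positive factor $\langle w, w \rangle$, so one vanishes exactly when the other does; this is why (4.28) can be used in place of (4.25) when solving for $a$ and $b$.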

With the solution of equations (4.21) and (4.28) for the constants $a$ and $b$ in equation (4.20), the function $w$ is then normalized. Finally, Lemma (4.1) guarantees that the translates of $\tilde{\tilde{\phi}}^1$ will be mutually orthogonal if $\tilde{\tilde{\phi}}^1$ is orthogonal to the translates of $w$. Therefore, applying the Gram-Schmidt procedure, we obtain

$$\tilde{\tilde{\phi}}^1(x) = \tilde{\phi}^1(x) - \langle \tilde{\phi}^1, w(\cdot + 1) \rangle\, w(x+1) - \langle \tilde{\phi}^1, w \rangle\, w(x) \tag{4.29}$$

The function $\tilde{\tilde{\phi}}^1$ is then normalized and we define

$$\begin{aligned}
\tilde{\tilde{\phi}}^2 &:= \tilde{\phi}^2 \\
\tilde{\tilde{\phi}}^3 &:= \tilde{\phi}^3 \\
\tilde{\tilde{\phi}}^4 &:= w
\end{aligned}$$

Then, the final orthogonal multiresolution analysis $\{\tilde{\tilde{V}}_j\}_{j\in\mathbb{Z}}$ is obtained, where

$$\tilde{\tilde{V}}_0 := \operatorname{span}\{\tilde{\tilde{\phi}}^s(\cdot - k) : s = 1, \ldots, 4\}_{k\in\mathbb{Z}} \tag{4.30}$$

Now that an orthogonal multiresolution analysis has been generated using the technique of intertwining, the double tilde notation is dropped for the multiresolution analysis defined in equation (4.30). The generators $\{\phi^1, \phi^2, \phi^3, \phi^4\}$ are depicted in Figure (4.8). The approximation spaces $\{V_j\}$ are then defined as

$$V_j := \operatorname{span}\{\phi^s_{j,k}(x) : s = 1, \ldots, 4\}_{k\in\mathbb{Z}} \tag{4.31}$$

where

$$\phi^s_{j,k}(x) = 2^{j/2}\,\phi^s(2^j x - k) \tag{4.32}$$

The scaling filter matrices associated with these generators are listed in Table (4.3).

Figure 4.8: Generators for Final Intertwining, Orthogonal Multiresolution Analysis (panels shown: scaling functions $\phi^1$, $\phi^2$)
Table 4.3: Piecewise-Quadratic Scaling Filter Matrices

  n    [a_n]
 -2    0                .0163446406689   -.0342630216987   -.0327571725062
       0                0                 0                 0
       0                0                 0                 0
       0                0                 0                 0

 -1    .113474174186    .0163446406689   -.422172442890    -.236881317738
       0                0                 0                 0
       0                0                 0                 0
       0                0                 0                 0

  0    .707106781187    .0163446406689    .422172442890    -.236881317738
       0                .618718433538    -.220970869121    -.0988211768803
       0                .707106781187     0                 0
       0               -.338665211381    -.316332942563    -.201862521835

  1    .113474174186    .0163446406689    .0342630216987   -.0327571725062
       .342326598441    .618718433538     .220970869121    -.0988211768803
       0               -.707106781187     0                 0
       .699272287923   -.338665211381     .316332942563    -.201862521835
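A quick consistency check on the entries of Table (4.3) is possible because orthonormal generators imply a discrete condition on the filter matrices: $\sum_n a_n a_{n-2l}^T$ should equal the identity for $l = 0$ and zero for $l \neq 0$. The sketch below is not part of the original development; it transcribes the table values, assumes each $4 \times 4$ matrix is listed row by row, and uses the hypothetical helper names `gram` and `max_deviation`.

```python
# Filter matrices a_n transcribed from Table (4.3); row s, column q.
A_N = {
    -2: [[0, .0163446406689, -.0342630216987, -.0327571725062],
         [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    -1: [[.113474174186, .0163446406689, -.422172442890, -.236881317738],
         [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    0: [[.707106781187, .0163446406689, .422172442890, -.236881317738],
        [0, .618718433538, -.220970869121, -.0988211768803],
        [0, .707106781187, 0, 0],
        [0, -.338665211381, -.316332942563, -.201862521835]],
    1: [[.113474174186, .0163446406689, .0342630216987, -.0327571725062],
        [.342326598441, .618718433538, .220970869121, -.0988211768803],
        [0, -.707106781187, 0, 0],
        [.699272287923, -.338665211381, .316332942563, -.201862521835]],
}
SIZE = 4


def gram(shift):
    """Return sum over n of a_n * transpose(a_{n - 2*shift}) as a nested list."""
    g = [[0.0] * SIZE for _ in range(SIZE)]
    for n, a in A_N.items():
        b = A_N.get(n - 2 * shift)
        if b is None:
            continue  # filters outside the tabulated range are zero
        for s in range(SIZE):
            for t in range(SIZE):
                g[s][t] += sum(a[s][q] * b[t][q] for q in range(SIZE))
    return g


def max_deviation(shift):
    """Largest entrywise distance from the identity (shift 0) or from zero (other shifts)."""
    g = gram(shift)
    return max(abs(g[s][t] - (1.0 if (shift == 0 and s == t) else 0.0))
               for s in range(SIZE) for t in range(SIZE))


dev0 = max_deviation(0)
dev1 = max_deviation(1)
```

Both deviations come out small, at the level of the rounding in the tabulated digits, which supports the row-by-row reading of the table and the claim that the generators are orthonormal.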

Now that a set of orthonormal scaling functions has been constructed, it is not difficult to derive an associated set of wavelets. Clearly, from the two-scale equations, each wavelet must be composed of a linear combination of scaling functions that span $V_1$. Furthermore, the space $W_0$ is the orthogonal complement to $V_0$, so the wavelets must be constructed so that they are orthogonal to all translates of the scaling functions $\{\phi^1, \phi^2, \phi^3, \phi^4\}$.

First, a symmetric wavelet, supported over $[-1,1]$, is constructed. We choose

$$\psi^1(x) = \phi^1_{1,0}(x) - \langle \phi^1_{1,0}, \phi^1 \rangle\, \phi^1(x) \tag{4.33}$$

Note that the Gram-Schmidt procedure has been used to construct $\psi^1$ such that it is orthogonal to $\phi^1$. By definition in equation (4.32), the function $\phi^1_{1,0}$ is a scaled and contracted form of $\phi^1$, also symmetric about the origin and supported over $[-\frac{1}{2}, \frac{1}{2}]$. Note that, because the remaining scaling functions $\{\phi^2, \phi^3, \phi^4\}$ are supported over $[0,1]$, there is no component of $\phi^1_{1,0}$ in these functions. Consequently, they are orthogonal to $\phi^1_{1,0}$. The symmetric wavelet $\psi^1$ is then normalized.
Next, an antisymmetric wavelet that is supported on $[-1,1]$ is constructed. We start with a symmetric function

$$z(x) = \phi^1(x) - \langle \phi^1, \phi^1_{1,0} \rangle\, \phi^1_{1,0}(x) \tag{4.34}$$

The contribution of $\phi^1_{1,0}$ is removed from $\phi^1$ in order to obtain a function that passes through the origin. The function $z$ is made antisymmetric by multiplying by the function $H$, defined as

$$H(x) := \begin{cases} -1 & x \in [-1, 0) \\ 1 & x \in [0, 1] \\ 0 & \text{otherwise} \end{cases} \tag{4.35}$$

We obtain

$$\psi^2(x) = H(x)\,z(x) = H(x)\left[\phi^1(x) - \langle \phi^1, \phi^1_{1,0} \rangle\, \phi^1_{1,0}(x)\right] \tag{4.36}$$

Since $\psi^2$ is an antisymmetric function, it is guaranteed to be orthogonal to $\phi^1$ and $\psi^1$. By construction, the translates of the functions $\psi^2$, $\psi^1$, and $\phi^1$ are all mutually orthogonal. Clearly, $\psi^2$ is also orthogonal to the remaining scaling functions $\{\phi^2, \phi^3, \phi^4\}$ since both $\phi^1$ and $\phi^1_{1,0}$ are orthogonal to these functions.

In order to complete the set of wavelets, two wavelets that are supported over $[0,1]$ are constructed. Recall that these wavelets must be formed as linear combinations of the scaling functions that span $V_1$ and have support contained in $[0,1]$. This implies that the remaining wavelets take the form

$$\begin{aligned}
\psi^t(x) = {} & a\,\phi^2_{1,0}(x) + b\,\phi^3_{1,0}(x) + c\,\phi^4_{1,0}(x) + d\,\phi^1_{1,1}(x) \\
& + e\,\phi^2_{1,1}(x) + f\,\phi^3_{1,1}(x) + g\,\phi^4_{1,1}(x), \quad t = 3, 4
\end{aligned} \tag{4.37}$$

where the seven constants $\{a, b, c, d, e, f, g\}$ must be determined. The wavelets must be constructed such that they are orthogonal to the translates of the scaling functions $\{\phi^1, \phi^2, \phi^3, \phi^4\}$. This requirement provides five orthogonality constraints for $\psi^t$, $t = 3, 4$:

$$\begin{aligned}
\langle \psi^t, \phi^s \rangle &= 0, \quad s = 1, \ldots, 4 \\
\langle \psi^t, \phi^1(\cdot - 1) \rangle &= 0
\end{aligned} \tag{4.38}$$

Note that since there are five orthogonality conditions and seven unknown constants, there is some freedom in the design of $\psi^3$ and $\psi^4$ as long as it is ensured that they are linearly independent. Finally, the Gram-Schmidt procedure is used to make the two functions orthogonal. In this case, $\psi^3$ has been constructed to be an antisymmetric function and $\psi^4$ has been constructed to be a symmetric function.

We now have a set of orthonormal wavelets $\{\psi^1, \psi^2, \psi^3, \psi^4\}$ whose translates form a basis for the orthogonal complement space $W_0$. These wavelets are depicted in Figure (4.9). The wavelet spaces $\{W_j\}$ are defined as

$$W_j := \operatorname{span}\{\psi^s_{j,k} : s = 1, \ldots, 4\}_{k\in\mathbb{Z}} \tag{4.39}$$

where

$$\psi^s_{j,k}(x) = 2^{j/2}\,\psi^s(2^j x - k) \tag{4.40}$$

Table (4.4) lists the wavelet filter matrices associated with these piecewise-quadratic multiwavelets.


4.4  Multiwavelets  From  Third-Order  Finite  Elements 
Using  the  same  procedure  that  was  applied  in  the  quadratic  case,  an  orthogonal 
multiresolution  analysis  can  be  constructed  from  the  classical  cubic  finite  element 
Figure 4.9: Piecewise-Quadratic Multiwavelets (panels shown: wavelets $\psi^3$, $\psi^4$)

Table 4.4: Piecewise-Quadratic Multiwavelet Filter Matrices

  n    [b_n]
 -2    0               -.0163446406689    .0342630216987    .0327571725062
       0               -.0231148125061    .0484552299729    .0463256376237
       0                0                 0                 0
       0                0                 0                 0

 -1   -.113474174186   -.0163446406689    .422172442890     .236881317738
      -.160476716113   -.0231148125061    .597041994395     .335000772219
       0                0                 0                 0
       0                0                 0                 0

  0    .707106781187   -.0163446406689   -.422172442890     .236881317738
       0                .0231148125061    .597041994395    -.335000772219
       0                0                -.288675134594    -.645497224366
       0                .0188731653801    .223959608749     .551430926265

  1   -.113474174186   -.0163446406689   -.0342630216987    .0327571725062
       .160476716113    .0231148125061    .0484552299729   -.0463256376237
       0                0                -.288675134594     .645497224366
       .539276980493    .0188731653801   -.223959608749     .551430926265


spaces. When viewed as finite shift-invariant spaces, the classical cubic finite element spaces are defined in terms of the three generators $\{\phi^1, \phi^2, \phi^3\}$ depicted in Figure (4.10). Because it is more convenient to work with symmetric and antisymmetric functions, an alternate set of generators $\{\phi^1, \rho^2, \rho^3\}$ is chosen, where

$$\rho^2 = \phi^2 + \phi^3 \tag{4.41}$$

$$\rho^3 = \phi^2 - \phi^3 \tag{4.42}$$

These functions, shown in Figure (4.11), are orthogonal due to symmetry. Clearly, since $\rho^2$ and $\rho^3$ are both formed as linear combinations of $\phi^2$ and $\phi^3$, $\{\phi^1, \rho^2, \rho^3\}$ is also a basis for the classical cubic finite element spaces.

Proceeding  to  the  first  intertwining  multiresolution  analysis,  the  functions  p2 
and  p3  are  normalized  and  we  define 
Figure 4.10: Generators for the Classical Cubic Finite Element Spaces (panel shown: scaling function $\phi^3$)

Figure 4.11: Alternate Generators for the Cubic Finite Element Spaces (panels: scaling functions $\phi^1$, $\rho^2$, $\rho^3$)
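Next, a function $\tilde{\phi}^4$ is selected that is a linear combination of the functions that span $A(V_1)$. We choose $\tilde{\phi}^4$ to be an antisymmetric function of the form

$$\tilde{\phi}^4(x) = \tilde{\phi}^2(2x) - \tilde{\phi}^2(2x - 1) + \tilde{\phi}^3(2x) + \tilde{\phi}^3(2x - 1) \tag{4.43}$$

By symmetry, $\tilde{\phi}^4$ is orthogonal to $\tilde{\phi}^2$. Then, $\tilde{\phi}^4$ is made orthogonal to $\tilde{\phi}^3$ by applying the Gram-Schmidt orthogonalization procedure:

$$\tilde{\phi}^4(x) \leftarrow \tilde{\phi}^4(x) - \langle \tilde{\phi}^4, \tilde{\phi}^3 \rangle\, \tilde{\phi}^3(x) \tag{4.44}$$

The function $\tilde{\phi}^4$ is normalized and $\tilde{\phi}^1$ is constructed using the Gram-Schmidt procedure:

$$\begin{aligned}
\tilde{\phi}^1(x) = {} & \phi^1(x) - \langle \phi^1, \tilde{\phi}^2(\cdot + 1) \rangle\, \tilde{\phi}^2(x+1) - \langle \phi^1, \tilde{\phi}^3(\cdot + 1) \rangle\, \tilde{\phi}^3(x+1) \\
& - \langle \phi^1, \tilde{\phi}^4(\cdot + 1) \rangle\, \tilde{\phi}^4(x+1) - \langle \phi^1, \tilde{\phi}^2 \rangle\, \tilde{\phi}^2(x) \\
& - \langle \phi^1, \tilde{\phi}^3 \rangle\, \tilde{\phi}^3(x) - \langle \phi^1, \tilde{\phi}^4 \rangle\, \tilde{\phi}^4(x)
\end{aligned} \tag{4.45}$$

A first intertwining multiresolution analysis $\{\tilde{V}_j\}$ can now be defined in terms of the generators $\{\tilde{\phi}^1, \tilde{\phi}^2, \tilde{\phi}^3, \tilde{\phi}^4\}$, where the space $\tilde{V}_0$ is defined as

$$\tilde{V}_0 := \operatorname{span}\{\tilde{\phi}^s(\cdot - k) : s = 1, \ldots, 4\}_{k\in\mathbb{Z}} \tag{4.46}$$

The generators for the first intertwining multiresolution analysis are shown in Figure (4.12). Note that, just as in the quadratic case, this first intertwining does not generate an orthogonal multiresolution analysis since the translates of $\tilde{\phi}^1$ are not mutually orthogonal.

Figure 4.12: Generators for the First Intertwining, Nonorthogonal Multiresolution Analysis (panels shown: scaling functions $\tilde{\phi}^3$, $\tilde{\phi}^4$)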

Proceeding to the second intertwining, a symmetric function $w \in A(\tilde{V}_1)$ is defined. This function is chosen to take the form

$$w(x) = \tilde{\phi}^1(2x - 1) + a\left[\tilde{\phi}^2(2x) + \tilde{\phi}^2(2x - 1)\right] + b\left[\tilde{\phi}^3(2x) - \tilde{\phi}^3(2x - 1)\right] \tag{4.47}$$

Because $w$ is symmetric, it is automatically orthogonal to the antisymmetric functions $\tilde{\phi}^3$ and $\tilde{\phi}^4$. It is required that $w$ also be orthogonal to $\tilde{\phi}^2$, so $w$ must satisfy the constraint

$$\langle w, \tilde{\phi}^2 \rangle = 0 \tag{4.48}$$

If equation (4.48) is satisfied, then $w \in A(\tilde{V}_1) \ominus A(\tilde{V}_0)$. Recall that Lemma (4.1) provides an additional condition that $w$ must satisfy. We require

$$(I - P_w)\, C_0(\tilde{V}_0) \perp (I - P_w)\, C_1(\tilde{V}_0) \tag{4.49}$$

Similar to the quadratic case, the spaces $C_0(\tilde{V}_0)$ and $C_1(\tilde{V}_0)$ are one-dimensional spaces defined as

$$c_0(x) := C_0(\tilde{V}_0) = \tilde{\phi}^1(x)\,\chi_{[0,1]}(x) \tag{4.50}$$

$$c_1(x) := C_1(\tilde{V}_0) = \tilde{\phi}^1(x - 1)\,\chi_{[0,1]}(x) \tag{4.51}$$

Just as in the quadratic case, the orthogonality condition in equation (4.49) can be reduced to

$$\langle w, w \rangle \langle c_0, c_1 \rangle - \langle w, c_0 \rangle \langle w, c_1 \rangle = 0 \tag{4.52}$$

Equations (4.48) and (4.52) can be solved for the unknown constants $a$ and $b$ in equation (4.47). Then, $w$ is normalized. Finally, Lemma (4.1) guarantees that an orthogonal multiresolution analysis will be obtained if it is ensured that $\tilde{\tilde{\phi}}^1$ is orthogonal to the translates of $w$. Towards this end, the Gram-Schmidt procedure is applied once again:

$$\tilde{\tilde{\phi}}^1(x) = \tilde{\phi}^1(x) - \langle \tilde{\phi}^1, w(\cdot + 1) \rangle\, w(x+1) - \langle \tilde{\phi}^1, w \rangle\, w(x) \tag{4.53}$$

The function $\tilde{\tilde{\phi}}^1$ is then normalized and we define

$$\begin{aligned}
\tilde{\tilde{\phi}}^2 &:= \tilde{\phi}^2 \\
\tilde{\tilde{\phi}}^3 &:= \tilde{\phi}^3 \\
\tilde{\tilde{\phi}}^4 &:= \tilde{\phi}^4 \\
\tilde{\tilde{\phi}}^5 &:= w
\end{aligned}$$

Then, the final orthogonal multiresolution analysis $\{\tilde{\tilde{V}}_j\}_{j\in\mathbb{Z}}$ is obtained, where

$$\tilde{\tilde{V}}_0 := \operatorname{span}\{\tilde{\tilde{\phi}}^s(\cdot - k) : s = 1, \ldots, 5\}_{k\in\mathbb{Z}} \tag{4.54}$$

For convenience, the double tilde notation is dropped for the multiresolution analysis defined in equation (4.54). The generators $\{\phi^1, \phi^2, \phi^3, \phi^4, \phi^5\}$ are depicted in Figure (4.13). The approximation spaces $\{V_j\}$ are then defined as

$$V_j := \operatorname{span}\{\phi^s_{j,k} : s = 1, \ldots, 5\}_{k\in\mathbb{Z}} \tag{4.55}$$

where, as before,

$$\phi^s_{j,k}(x) = 2^{j/2}\,\phi^s(2^j x - k) \tag{4.56}$$

The scaling filter matrices associated with these generators are listed in Table (4.5).

It now remains to construct a set of orthonormal wavelets associated with the above orthogonal multiresolution analysis. This construction follows closely the procedure used to derive the quadratic wavelets. As in the quadratic case, a symmetric wavelet, supported over $[-1,1]$, is constructed:

$$\psi^1(x) = \phi^1_{1,0}(x) - \langle \phi^1_{1,0}, \phi^1 \rangle\, \phi^1(x) \tag{4.57}$$

This function is orthogonal to the scaling functions $\{\phi^1, \phi^2, \phi^3, \phi^4, \phi^5\}$ by the same argument employed in the quadratic case. Then, $\psi^1$ is normalized and we proceed
Figure 4.13: Generators for the Final Intertwining, Orthogonal Multiresolution Analysis (one panel per scaling function, $\phi^1$ through $\phi^5$)
Table 4.5: Piecewise-Cubic Scaling Filter Matrices

  n    $[a_n]$ (5 x 5 matrix, one row per line)

 -2    0                 .0187976275001    .00522145045609   -.0611853406101   -.0758597160812
       0                 0                 0                 0                 0
       0                 0                 0                 0                 0
       0                 0                 0                 0                 0
       0                 0                 0                 0                 0

 -1    -.123878397621    .0187976275001    -.00522145045609  .311185340610     .357152985811
       0                 0                 0                 0                 0
       0                 0                 0                 0                 0
       0                 0                 0                 0                 0
       0                 0                 0                 0                 0

  0    .707106781187     .0187976275001    .00522145045609   -.311185340610    .357152985811
       0                 .618718433538     -.233853586673    .0883883476483    .153093108924
       0                 .701560760020     .0883883476483    0                 0
       0                 .0883883476483    -.701560760020    0                 0
       0                 .224458784373     .666550563194     .0488842065142    .0354900499726

  1    -.123878397621    .0187976275001    -.00522145045609  .0611853406101    -.0758597160812
       .250000000000     .618718433538     .233853586673     -.0883883476483   .153093108924
       0                 -.701560760020    .0883883476483    0                 0
       0                 -.0883883476483   -.701560760020    0                 0
       .0579550089191    .224458784373     -.666550563194    -.0488842065142   .0354900499726


to the construction of an antisymmetric wavelet that is supported over $[-1, 1]$. As in
the quadratic case, we start with the symmetric function

$$z(x) = \phi^1(x) - \langle \phi^1, \phi^1_{1,0} \rangle\, \phi^1_{1,0}(x) \tag{4.58}$$

Then, an antisymmetric function is generated by multiplying $z$ by the function $H$,
defined in equation (4.41), to obtain

$$\psi^2(x) = H(x)\, z(x) = H(x) \left[ \phi^1(x) - \langle \phi^1, \phi^1_{1,0} \rangle\, \phi^1_{1,0}(x) \right] \tag{4.59}$$

By construction, this function is orthogonal to the scaling functions $\{\phi^1, \phi^2, \phi^3, \phi^4, \phi^5\}$
and their translates. Furthermore, it is also orthogonal to all translates of the
symmetric wavelet $\psi^1$.

Finally, three wavelets that are supported over $[0, 1]$ are constructed. These
wavelets are formed as linear combinations of the scaling functions that span $V_1$ and
have support contained in $[0, 1]$. Then, these wavelets take the form

$$\begin{aligned}
\psi^t(x) = {} & a\,\phi^2_{1,0}(x) + b\,\phi^3_{1,0}(x) + c\,\phi^4_{1,0}(x) + d\,\phi^5_{1,0}(x) + e\,\phi^1_{1,1}(x) \\
 & + f\,\phi^2_{1,1}(x) + g\,\phi^3_{1,1}(x) + h\,\phi^4_{1,1}(x) + i\,\phi^5_{1,1}(x), \qquad t = 3, 4, 5
\end{aligned} \tag{4.60}$$


Equation (4.60) contains nine constants that must be determined. Because the
wavelets must be constructed such that they are orthogonal to the translates of
the scaling functions $\{\phi^1, \phi^2, \phi^3, \phi^4, \phi^5\}$, we have the following six orthogonality
constraints for $t = 3, 4, 5$:

$$\begin{aligned}
\langle \psi^t, \phi^s \rangle &= 0, \qquad s = 1, \ldots, 5 \\
\langle \psi^t, \phi^1(\cdot - 1) \rangle &= 0
\end{aligned} \tag{4.61}$$

Clearly, we have the freedom to choose three of the constants for each of the wavelets
in equation (4.60) as long as the resulting functions are linearly independent. The
Gram-Schmidt procedure is used to ensure that these wavelets are also mutually
orthogonal. In this case, $\psi^3$ and $\psi^5$ have been chosen to be symmetric functions
while $\psi^4$ is chosen to be an antisymmetric function.

We now have a set of piecewise-cubic, orthonormal wavelets $\{\psi^1, \psi^2, \psi^3, \psi^4, \psi^5\}$
whose translates form a basis for the orthogonal complement space $W_0$. These
wavelets are depicted in Figure (4.14). The wavelet spaces $\{W_j\}$ are defined as

$$W_j := \operatorname{span}\left\{ \psi^s_{j,k} : s = 1, \ldots, 5 \right\}_{k \in \mathbb{Z}} \tag{4.62}$$

where, as usual,

$$\psi^s_{j,k}(x) = 2^{j/2}\, \psi^s(2^j x - k) \tag{4.63}$$

The wavelet filter matrices associated with these piecewise-cubic multiwavelets are
listed in Table (4.6).
Figure 4.14: Piecewise-Cubic Multiwavelets (one panel per wavelet, $\psi^1$ through $\psi^5$)
Table 4.6: Piecewise-Cubic Multiwavelet Filter Matrices

  n    $[b_n]$ (5 x 5 matrix, one row per line)

 -2    0                 -.0187976275001   -.00522145045609  .0611853406101    .0758597160812
       0                 -.0265838597511   -.00738424605026  .0865291385093    .107281839320
       0                 0                 0                 0                 0
       0                 0                 0                 0                 0
       0                 0                 0                 0                 0

 -1    .123878397621     -.0187976275001   .00522145045609   -.311185340610    -.357152985811
       .175190510001     -.0265838597511   .00738424605026   -.440082529103    -.505090596376
       0                 0                 0                 0                 0
       0                 0                 0                 0                 0
       0                 0                 0                 0                 0

  0    .707106781187     -.0187976275001   -.00522145045609  .311185340610     -.357152985811
       0                 .0265838597511    .00738424605026   -.440082529103    .505090596376
       0                 -.124934166510    .0260907284726    -.305772920466    0
       0                 0                 0                 .612372435696     .353553390593
       0                 .219932720984     -.0111360780385   -.344936091180    -.563072951901

  1    .123878397621     -.0187976275001   .00522145045609   -.0611853406101   .0758597160812
       -.175190510001    .0265838597511    -.00738424605026  .0865291385093    -.107281839320
       .883417963408     -.124934166510    -.0260907284726   .305772920466     0
       0                 0                 0                 .612372435696     -.353553390593
       -.175917764003    .219932720984     .0111360780385    .344936091180     -.563072951901

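
As a consistency check on the two tables, equation (4.57) can be carried out directly on filter coefficients. The sketch below is illustrative only and rests on two assumptions that are not stated in the text: that the first row of each filter matrix holds the coefficients of $\phi^1$ (respectively $\psi^1$) over the level-1 basis, and that this level-1 basis is orthonormal, so that $L^2$ inner products reduce to dot products of coefficient rows. The names `A_ROW1`, `B_ROW1`, `inner` and `symmetric_wavelet_row` are hypothetical.

```python
import math

N_SHIFTS = (-2, -1, 0, 1)

# First rows of the scaling filter matrices in Table 4.5, keyed by shift n.
A_ROW1 = {
    -2: [0.0, 0.0187976275001, 0.00522145045609, -0.0611853406101, -0.0758597160812],
    -1: [-0.123878397621, 0.0187976275001, -0.00522145045609, 0.311185340610, 0.357152985811],
    0: [0.707106781187, 0.0187976275001, 0.00522145045609, -0.311185340610, 0.357152985811],
    1: [-0.123878397621, 0.0187976275001, -0.00522145045609, 0.0611853406101, -0.0758597160812],
}

# First rows of the wavelet filter matrices in Table 4.6, keyed by shift n.
B_ROW1 = {
    -2: [0.0, -0.0187976275001, -0.00522145045609, 0.0611853406101, 0.0758597160812],
    -1: [0.123878397621, -0.0187976275001, 0.00522145045609, -0.311185340610, -0.357152985811],
    0: [0.707106781187, -0.0187976275001, -0.00522145045609, 0.311185340610, -0.357152985811],
    1: [0.123878397621, -0.0187976275001, 0.00522145045609, -0.0611853406101, 0.0758597160812],
}


def inner(u, v):
    """Dot product of two coefficient tables (assumes an orthonormal level-1 basis)."""
    return sum(x * y for n in N_SHIFTS for x, y in zip(u[n], v[n]))


def symmetric_wavelet_row(a_row1):
    """Coefficient form of (4.57): psi^1 = phi^1_{1,0} - <phi^1_{1,0}, phi^1> phi^1, normalized."""
    # phi^1_{1,0} is itself a level-1 basis function: a single 1 at shift 0, generator 1.
    unit = {n: [0.0] * 5 for n in N_SHIFTS}
    unit[0][0] = 1.0
    proj = inner(unit, a_row1)
    raw = {n: [u - proj * a for u, a in zip(unit[n], a_row1[n])] for n in N_SHIFTS}
    norm = math.sqrt(inner(raw, raw))
    return {n: [c / norm for c in raw[n]] for n in N_SHIFTS}


psi1 = symmetric_wavelet_row(A_ROW1)
max_diff = max(abs(c - b) for n in N_SHIFTS for c, b in zip(psi1[n], B_ROW1[n]))
print("largest deviation from Table 4.6, row 1:", max_diff)
```

Under these assumptions, the computed row should agree with the first row of Table 4.6 to within the rounding of the tabulated values, and it should be orthogonal to the first row of Table 4.5.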

4.5 Piecewise-Polynomial Multiwavelets of Arbitrary Order
In the previous two sections, the construction of orthogonal, piecewise-quadratic
and piecewise-cubic multiwavelets has been demonstrated using successive intertwining
multiresolution analyses. This procedure can be applied to generate higher-order,
orthogonal multiresolution analyses from the classical (Lagrangian) finite element
basis functions of appropriate order. The following theorem can be stated regarding the
construction of orthogonal, piecewise-polynomial multiwavelets of arbitrary order:

Theorem 4.1: An orthogonal multiresolution analysis of order $r$, $r > 1$, can be
generated from the classical (Lagrangian) finite element spaces of the same order. The
scaling functions and wavelets that correspond to the resulting multiresolution
analysis possess the following properties:

(1) The multiresolution analysis is generated by $r + 2$ orthogonal scaling functions
of order $r$, minimally supported over $[-1, 1]$. One scaling function, $\phi^1$, is
symmetric about the origin and supported over $[-1, 1]$. The remaining $r + 1$ scaling
functions, $\{\phi^2, \ldots, \phi^{r+2}\}$, are supported over $[0, 1]$.

(2) There are $r + 2$ corresponding orthogonal wavelets, minimally supported over
$[-1, 1]$. One wavelet, $\psi^1$, is symmetric about the origin and supported over
$[-1, 1]$. A second wavelet, $\psi^2$, is antisymmetric about the origin and supported
over $[-1, 1]$. The remaining $r$ wavelets, $\{\psi^3, \ldots, \psi^{r+2}\}$, are supported over $[0, 1]$.

Proof: (1) The classical finite element spaces of order $r$ are generated by $r + 1$
Lagrangian interpolation functions of order $r$. Two of these functions are nonzero at
the element boundary, while the remaining $r - 1$ "interior" functions are zero at the
boundaries. When these spaces are viewed as finite shift-invariant spaces, the two
functions that are nonzero at the element boundaries can be assembled into one
symmetric function, spanning two elements. If the length of the standard finite element
is taken as one, a shift-invariant space is generated by one function supported over
$[-1, 1]$ and $r - 1$ interior functions supported over $[0, 1]$. As discussed earlier, an
orthogonal multiresolution analysis of order $r$ can be generated from this shift-invariant
space using the technique of successive intertwinings. Each intertwining involves the
construction of an additional scaling function, supported over $[0, 1]$. Therefore, the
final, orthogonal multiresolution analysis will have $r + 2$ generators, one supported
over $[-1, 1]$ and $r + 1$ supported over $[0, 1]$.

(2) Part (2) of the above theorem follows as a special case of Lemmas 4.1, 4.2,
and 4.4 [3]. The following argument is a simplified form of the more general proof
provided by Donovan, Geronimo, and Hardin in Ref. [3]. First, it is demonstrated
that a symmetric wavelet can be constructed with support over $[-1, 1]$. Consider the
function

$$\psi^1(x) = \phi^1_{1,0}(x) - \langle \phi^1, \phi^1_{1,0} \rangle\, \phi^1(x) \tag{4.64}$$

supported over $[-1, 1]$. Recall from equation (4.56) that $\phi^1_{1,0}$ is a scaled, contracted
form of $\phi^1$. Because $\phi^1$ and $\phi^1_{1,0}$ are symmetric about the origin, so is $\psi^1$. Clearly,
$\psi^1 \in V_1$ since $\phi^1_{1,0} \in V_1$ and $\phi^1 \in V_0 \subset V_1$. By construction, $\psi^1$ is orthogonal to
$\phi^1$. Because $\phi^1_{1,0}$ is supported over $[-\tfrac{1}{2}, \tfrac{1}{2}]$, and the scaling functions $\{\phi^2, \ldots, \phi^{r+2}\}$
are supported over $[0, 1]$, it is clear that $\{\phi^2, \ldots, \phi^{r+2}\}$ contain no component of $\phi^1_{1,0}$.
Therefore, $\psi^1$ is also orthogonal to the scaling functions $\{\phi^2, \ldots, \phi^{r+2}\}$, and it can be
concluded that $\psi^1 \in V_1 \ominus V_0$. Recall, from the definition of the wavelet complement
spaces,

$$W_0 := V_1 \ominus V_0 \tag{4.65}$$

Clearly, then, $\psi^1 \in W_0$ and we have proven the existence of a symmetric wavelet that
is supported over $[-1, 1]$.

Next, it can be shown that an antisymmetric wavelet can be constructed with
support over $[-1, 1]$. We start with the function

$$z(x) = \phi^1(x) - \langle \phi^1, \phi^1_{1,0} \rangle\, \phi^1_{1,0}(x) \tag{4.66}$$

The symmetric function $z$ is orthogonal to the scaling functions $\{\phi^2, \ldots, \phi^{r+2}\}$.
Furthermore, by removing the contribution of $\phi^1_{1,0}$ to $\phi^1$, the resulting function $z$ passes
through the origin. An antisymmetric function is created by multiplying $z$ by the
function $H$, defined in equation (4.41). We obtain

$$\psi^2(x) = z(x)\, H(x) \tag{4.67}$$

By symmetry, $\psi^2$ is orthogonal to the symmetric functions $\phi^1$ and $\psi^1$. It can also
be shown that, by construction, $\psi^2$ is also orthogonal to the translates of $\phi^1$ and $\psi^1$.
Then, it can be concluded that $\psi^2 \in W_0$ and we have proven the existence of an
antisymmetric wavelet that is supported over $[-1, 1]$.

Finally, it remains to be shown that there are exactly $r$ wavelets supported over
$[0, 1]$. First, note that these interior wavelets can be composed only of functions that
are contained in the space $A(V_1)$. Recall that $A(V_1)$ is defined as the space spanned
by the basis functions of $V_1$ whose support is contained in $[0, 1]$. Therefore, $A(V_1)$ is
defined as

$$A(V_1) := \operatorname{span}\left\{ \phi^1_{1,1},\ \phi^2_{1,0}, \ldots, \phi^{r+2}_{1,0},\ \phi^2_{1,1}, \ldots, \phi^{r+2}_{1,1} \right\} \tag{4.68}$$

and the dimension of $A(V_1)$ is $2r + 3$. The wavelets must be orthogonal to the scaling
functions that span $A(V_0)$, defined as the $(r + 1)$-dimensional space

$$A(V_0) := \operatorname{span}\left\{ \phi^2, \ldots, \phi^{r+2} \right\} \tag{4.69}$$

In addition, the wavelets must be constructed such that they are orthogonal to the
translates of $\phi^1$, the scaling function supported over $[-1, 1]$. We define the spaces

$$Q_0 := \operatorname{span}\left\{ \phi^1 \chi_{[0,1]} \right\} \tag{4.70}$$

$$Q_1 := \operatorname{span}\left\{ \phi^1(\cdot - 1)\, \chi_{[0,1]} \right\} \tag{4.71}$$

That is, $Q_0$ and $Q_1$ are the one-dimensional spaces spanned by the restrictions of $\phi^1$
and $\phi^1(\cdot - 1)$ to $[0, 1]$. Defining $\Psi$ to be the space spanned by the wavelets that are
supported over $[0, 1]$, we have

$$\Psi = A(V_1) \ominus A(V_0) \ominus Q_0 \ominus Q_1 \tag{4.72}$$

since $\Psi \subset A(V_1)$ and must be orthogonal to the spaces $A(V_0)$, $Q_0$, and $Q_1$. The
dimension of $\Psi$ is then calculated to be

$$\begin{aligned}
\dim(\Psi) &= \dim(A(V_1)) - \dim(A(V_0)) - \dim(Q_0) - \dim(Q_1) \\
 &= (2r + 3) - (r + 1) - 1 - 1 \\
 &= r
\end{aligned}$$

Therefore, $\Psi$ is an $r$-dimensional space and there must be $r$ wavelets supported over
$[0, 1]$. This completes the proof.
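
The dimension count at the end of the proof can be checked numerically. The following is a minimal sketch; the function name `interior_wavelet_count` is hypothetical and the code only restates the arithmetic of the proof for a given order $r$.

```python
def interior_wavelet_count(r: int) -> int:
    """Number of wavelets supported over [0, 1] for an order-r construction (r > 1)."""
    if r <= 1:
        raise ValueError("Theorem 4.1 is stated for r > 1")
    # A(V_1): phi^1_{1,1} plus the r + 1 interior generators at shifts k = 0 and k = 1.
    dim_a_v1 = 1 + 2 * (r + 1)
    # A(V_0): the r + 1 generators supported over [0, 1].
    dim_a_v0 = r + 1
    # Q_0 and Q_1: restrictions of phi^1 and phi^1(. - 1) to [0, 1].
    dim_q0 = 1
    dim_q1 = 1
    return dim_a_v1 - dim_a_v0 - dim_q0 - dim_q1


for order in (2, 3, 4):
    print(order, interior_wavelet_count(order))
```

For the cubic case, $r = 3$, the count is three, which matches the three wavelets $\psi^3$, $\psi^4$, $\psi^5$ constructed from the nine constants in equation (4.60).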

The above theorem guarantees that we can generate an orthogonal multiresolution
analysis of arbitrary order greater than one. First-order, linear multiwavelets
are an exception to the theorem because the resulting orthogonal multiresolution
analysis has four scaling functions and wavelets, while the theorem would imply that
there are only three. The extra scaling function and wavelet are needed because
three dimensions are not enough to construct an orthogonal multiresolution analysis
from the classical linear finite element basis functions. As discussed earlier, this
construction has been carried out by Donovan, Geronimo, and Hardin [2]. In the
previous two sections, the applicability of the above theorem has been demonstrated for
piecewise-quadratic and piecewise-cubic multiwavelets. The scaling functions are
generated using the technique of successive intertwining multiresolution analyses, while
the wavelets are constructed according to the procedure outlined in the above proof.
The significance of the theorem is that these methods can be applied to generate
orthogonal multiresolution analyses of arbitrary approximation order.

4.6  Boundary-Adapted  Multiwavelets 

It  is  not  difficult  to  adapt  the  piecewise-polynomial  multiwavelets  that  have  been 
derived  from  classical  finite  element  spaces  to  finite  boundaries.  This  is  necessary  so 
that  the  multiwavelets  can  be  used  to  approximate  functions  that  are  supported  over 
finite  domains.  In  particular,  this  will  prove  useful  later  when  these  multiwavelets 
are  employed  in  the  representation  of  first  and  second-order  Volterra  kernels,  which 
have  finite  support. 

As an example, consider the case in which a wavelet basis is needed for $L^2[0, 1]$,
the space of square-integrable functions supported over $[0, 1]$. This case can easily
be generalized to other one-dimensional domains. In this example, the piecewise-linear
multiwavelets are used although the results are applicable to the higher-order
multiwavelets as well. In constructing a basis for $L^2[0, 1]$ using the piecewise-linear
multiwavelets, recall that there are a total of four scaling functions $\{\phi^1, \phi^2, \phi^3, \phi^4\}$
defined on level 0. The function $\phi^1$ is supported over $[-1, 1]$ while the functions
$\{\phi^2, \phi^3, \phi^4\}$ are all supported over $[0, 1]$. The space $V_0 \subset L^2(\mathbb{R})$ is defined as the
span of all integer translates of these four scaling functions. In constructing a basis
for $V_0[0, 1] \subset L^2[0, 1]$, note that there are a total of five functions that intersect the
domain $[0, 1]$, as shown in Figure (4.15). From the figure, it is clear that the functions
$\phi^1$ and $\phi^1(\cdot - 1)$ both overlap the boundaries. The functions $\phi^1 \chi_{[0,1]}$ and $\phi^1(\cdot - 1)\chi_{[0,1]}$
are defined as the restrictions of these functions to the domain $[0, 1]$:


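$$\phi^1(x)\,\chi_{[0,1]}(x) = \begin{cases} \phi^1(x), & x \in [0, 1] \\ 0, & \text{otherwise} \end{cases} \tag{4.73}$$

$$\phi^1(x - 1)\,\chi_{[0,1]}(x) = \begin{cases} \phi^1(x - 1), & x \in [0, 1] \\ 0, & \text{otherwise} \end{cases} \tag{4.74}$$

where $\chi_{[0,1]}$ is the characteristic function over $[0, 1]$. These functions are depicted in
Figure (4.16). Then, the space $V_0[0, 1]$ is defined as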


$$V_0[0, 1] := \operatorname{span}\left\{ \phi^1 \chi_{[0,1]},\ \phi^2,\ \phi^3,\ \phi^4,\ \phi^1(\cdot - 1)\chi_{[0,1]} \right\} \tag{4.75}$$

In a similar manner, on resolution level 1 there are a total of nine scaling functions
that intersect the domain $[0, 1]$. These functions are shown in Figure (4.17) where,
as in previous sections, we have the notation

$$\phi^s_{j,k}(x) = 2^{j/2}\, \phi^s(2^j x - k) \tag{4.76}$$

The functions $\phi^1_{1,0}$ and $\phi^1_{1,2}$ overlap the boundaries. Restricting these functions to
$[0, 1]$, the space $V_1[0, 1]$ is defined as

$$V_1[0, 1] := \operatorname{span}\left\{ \phi^1_{1,0}\chi_{[0,1]},\ \phi^2_{1,0},\ \phi^3_{1,0},\ \phi^4_{1,0},\ \phi^1_{1,1},\ \phi^2_{1,1},\ \phi^3_{1,1},\ \phi^4_{1,1},\ \phi^1_{1,2}\chi_{[0,1]} \right\} \tag{4.77}$$

Figure 4.15: Level 0 Scaling Functions That Intersect [0,1]

Figure 4.16: Boundary-Adapted Scaling Functions

Figure 4.17: Level 1 Scaling Functions That Intersect [0,1]

In general, the space $V_j[0, 1]$ has dimension $4 \cdot 2^j + 1$ and is defined as

$$V_j[0, 1] := \operatorname{span}\left\{ \phi^1_{j,0}\chi_{[0,1]},\ \left\{\phi^1_{j,k}\right\}_{k=1}^{2^j - 1},\ \left\{\phi^s_{j,k}\right\}_{k=0}^{2^j - 1}\ (s = 2, 3, 4),\ \phi^1_{j,2^j}\chi_{[0,1]} \right\} \tag{4.78}$$


In constructing the spaces $\{V_j[0, 1]\}$ that form a multiresolution analysis for
$L^2[0, 1]$, every scaling function that intersects $[0, 1]$ is included in the basis. If the
function overlaps the boundary, it is simply restricted to $[0, 1]$ and included in the
basis. This simple procedure cannot be used for the wavelet spaces $\{W_j[0, 1]\}$, however.
To demonstrate this, consider the space $W_0[0, 1]$, defined as

$$W_0[0, 1] := V_1[0, 1] \ominus V_0[0, 1] \tag{4.79}$$

This space is constructed using the four piecewise-linear wavelets $\{\psi^1, \psi^2, \psi^3, \psi^4\}$.
The wavelets $\psi^1$ and $\psi^2$ are supported over $[-1, 1]$ while the wavelets $\psi^3$ and $\psi^4$
are both supported over $[0, 1]$. As shown in Figure (4.18), there are a total of six
wavelets that intersect the domain $[0, 1]$. The wavelets $\{\psi^1, \psi^2, \psi^1(\cdot - 1), \psi^2(\cdot - 1)\}$
all overlap the boundaries. Recall that the dimension of the space $V_0[0, 1]$ is five while
the dimension of $V_1[0, 1]$ is nine. Then, equation (4.79) implies that

$$\dim\left(W_0[0, 1]\right) = \dim\left(V_1[0, 1]\right) - \dim\left(V_0[0, 1]\right) = 4$$

Therefore, the space $W_0[0, 1]$ cannot simply be defined as the span of the six wavelets
$\{\psi^1\chi_{[0,1]}, \psi^2\chi_{[0,1]}, \psi^3, \psi^4, \psi^1(\cdot - 1)\chi_{[0,1]}, \psi^2(\cdot - 1)\chi_{[0,1]}\}$. Indeed, equation (4.79) implies
that these wavelets do not form a linearly independent set because two of them are
redundant. In determining which two functions to exclude from the basis for $W_0[0, 1]$,
recall that the spaces $V_0[0, 1]$ and $W_0[0, 1]$ are required to be orthogonal. The
antisymmetric wavelet $\psi^2$ is automatically orthogonal to the symmetric scaling function
$\phi^1$. This does not imply, however, that the individual halves of these functions are
orthogonal. That is, in general,

$$\begin{aligned}
\langle \psi^2 \chi_{[0,1]},\ \phi^1 \chi_{[0,1]} \rangle &\neq 0 \\
\langle \psi^2(\cdot - 1)\chi_{[0,1]},\ \phi^1(\cdot - 1)\chi_{[0,1]} \rangle &\neq 0
\end{aligned}$$

Therefore, $\psi^2\chi_{[0,1]}$ and $\psi^2(\cdot - 1)\chi_{[0,1]}$ should not be included in the $W_0[0, 1]$ basis. On
the other hand, by construction, since the functions $\psi^1$ and $\phi^1$ are orthogonal, and
both are symmetric over $[-1, 1]$, they must satisfy

$$\begin{aligned}
\langle \psi^1 \chi_{[0,1]},\ \phi^1 \chi_{[0,1]} \rangle &= 0 \\
\langle \psi^1(\cdot - 1)\chi_{[0,1]},\ \phi^1(\cdot - 1)\chi_{[0,1]} \rangle &= 0
\end{aligned}$$

Therefore, the wavelets $\{\psi^1, \psi^3, \psi^4, \psi^1(\cdot - 1)\}$ form an orthogonal set and are
orthogonal to the scaling functions that span $V_0[0, 1]$. The space $W_0[0, 1]$ is then defined
as

$$W_0[0, 1] := \operatorname{span}\left\{ \psi^1\chi_{[0,1]},\ \psi^3,\ \psi^4,\ \psi^1(\cdot - 1)\chi_{[0,1]} \right\} \tag{4.80}$$

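It is important to note that the antisymmetric wavelet $\psi^2$ is not excluded from the
finer-resolution wavelet spaces. For example, the space $W_1[0, 1]$ is defined as

$$W_1[0, 1] := \operatorname{span}\left\{ \psi^1_{1,0}\chi_{[0,1]},\ \psi^3_{1,0},\ \psi^4_{1,0},\ \psi^1_{1,1},\ \psi^2_{1,1},\ \psi^3_{1,1},\ \psi^4_{1,1},\ \psi^1_{1,2}\chi_{[0,1]} \right\} \tag{4.81}$$

where, as usual,

$$\psi^s_{j,k} = 2^{j/2}\, \psi^s(2^j \cdot - k) \tag{4.82}$$

The wavelets $\psi^2_{1,0}\chi_{[0,1]}$ and $\psi^2_{1,2}\chi_{[0,1]}$ intersect the boundary and are excluded from
the $W_1[0, 1]$ basis, but the "interior" wavelet $\psi^2_{1,1}$ is needed in the basis. In general,
the space $W_j[0, 1]$ has dimension $4 \cdot 2^j$ and is defined as

$$W_j[0, 1] := \operatorname{span}\left\{ \psi^1_{j,0}\chi_{[0,1]},\ \left\{\psi^1_{j,k}\right\}_{k=1}^{2^j - 1},\ \left\{\psi^2_{j,k}\right\}_{k=1}^{2^j - 1},\ \left\{\psi^3_{j,k}\right\}_{k=0}^{2^j - 1},\ \left\{\psi^4_{j,k}\right\}_{k=0}^{2^j - 1},\ \psi^1_{j,2^j}\chi_{[0,1]} \right\} \tag{4.83}$$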
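
The index bookkeeping behind the boundary-adapted spaces for the piecewise-linear case can be made concrete by enumerating the basis labels. The sketch below is only an illustration: the tuple layout `(family, s, k, kind)` and the function names are hypothetical, and no function values are computed. It lists which translates enter $V_j[0, 1]$ and $W_j[0, 1]$ and confirms the stated dimensions.

```python
def scaling_basis(j: int):
    """Labels of the boundary-adapted scaling functions spanning V_j[0, 1]."""
    n = 2 ** j
    # phi^1 translates k = 0..2^j; the first and last are restricted to [0, 1].
    labels = [("phi", 1, k, "restricted" if k in (0, n) else "full") for k in range(n + 1)]
    # phi^2, phi^3, phi^4 are supported over one cell each: k = 0..2^j - 1.
    labels += [("phi", s, k, "full") for s in (2, 3, 4) for k in range(n)]
    return labels


def wavelet_basis(j: int):
    """Labels of the boundary-adapted wavelets spanning W_j[0, 1]."""
    n = 2 ** j
    # psi^1 translates k = 0..2^j; the first and last are restricted to [0, 1].
    labels = [("psi", 1, k, "restricted" if k in (0, n) else "full") for k in range(n + 1)]
    # psi^2: only interior translates k = 1..2^j - 1; the boundary copies are dropped.
    labels += [("psi", 2, k, "full") for k in range(1, n)]
    # psi^3, psi^4 are supported over one cell each: k = 0..2^j - 1.
    labels += [("psi", s, k, "full") for s in (3, 4) for k in range(n)]
    return labels


for level in range(3):
    print(level, len(scaling_basis(level)), len(wavelet_basis(level)))
```

At level 0 this gives five scaling functions and four wavelets, and at level 1 nine scaling functions and eight wavelets. The counts satisfy $\dim V_{j+1}[0,1] = \dim V_j[0,1] + \dim W_j[0,1]$ at every level.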


Figure 4.18: Level 0 Wavelets That Intersect [0,1]

The extension of the foregoing results to higher dimensions is relatively
straightforward. In higher dimensions, the scaling functions and wavelets are formed as
tensor products of the one-dimensional, boundary-adapted functions using the same
procedure that was described in Section 3.6. Defining $\Gamma$ as the $[0, 1] \times [0, 1]$ square
domain, the coarsest-scale space $V_0(\Gamma)$ is spanned by a total of 25 two-dimensional
linear scaling functions, obtained as the tensor products of the five one-dimensional
scaling functions in the first dimension (x-direction) with those in the second
dimension (y-direction). Similarly, the wavelet spaces $W_0^1(\Gamma)$ and $W_0^2(\Gamma)$ are each spanned
by a total of 20 two-dimensional wavelets. These are obtained as the tensor products
of the five scaling functions in the x-direction with the four wavelets in the y-direction
for $W_0^1(\Gamma)$ and the four wavelets in the x-direction with the five scaling functions in
the y-direction for $W_0^2(\Gamma)$. Finally, the wavelet space $W_0^3(\Gamma)$ is spanned by 16 wavelets
formed as the tensor products of the four wavelets in the x-direction with those in
the y-direction. In general, the space $V_j(\Gamma)$ is spanned by $(4 \cdot 2^j + 1)^2$ two-dimensional
scaling functions while the wavelet spaces $W_j^1(\Gamma)$, $W_j^2(\Gamma)$, and $W_j^3(\Gamma)$ are spanned
by $(4 \cdot 2^j + 1)(4 \cdot 2^j)$, $(4 \cdot 2^j)(4 \cdot 2^j + 1)$, and $(4 \cdot 2^j)^2$ wavelets, respectively. The
generalization to three dimensions and higher follows in the same manner.
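
The two-dimensional counts follow directly from the one-dimensional ones. A minimal sketch of the arithmetic is given below; the function name `tensor_dimensions` and the dictionary keys are hypothetical labels for the four tensor-product spaces on the square domain.

```python
def tensor_dimensions(j: int) -> dict:
    """Dimensions of the tensor-product spaces on [0, 1] x [0, 1] at level j."""
    n_scaling = 4 * 2 ** j + 1   # one-dimensional boundary-adapted scaling functions
    n_wavelet = 4 * 2 ** j       # one-dimensional boundary-adapted wavelets
    return {
        "V": n_scaling * n_scaling,    # scaling (x) with scaling (y)
        "W1": n_scaling * n_wavelet,   # scaling (x) with wavelet (y)
        "W2": n_wavelet * n_scaling,   # wavelet (x) with scaling (y)
        "W3": n_wavelet * n_wavelet,   # wavelet (x) with wavelet (y)
    }


print(tensor_dimensions(0))
```

At level 0 the four counts are 25, 20, 20 and 16. Their sum, 81, equals the 9 x 9 scaling functions on level 1, which is the two-dimensional analogue of the one-dimensional relation between $V_1[0,1]$, $V_0[0,1]$ and $W_0[0,1]$.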

As a final note, although the above example specifically uses the piecewise-linear
multiwavelets, the procedure for deriving boundary-adapted scaling functions and
wavelets directly generalizes to the higher-order multiwavelets. This is because,
regardless of the order of the scaling functions used, there is always exactly one
symmetric scaling function supported over $[-1, 1]$ while all the others are supported over
$[0, 1]$. Similarly, there is always exactly one symmetric and one antisymmetric wavelet
supported over $[-1, 1]$ while the others are all supported over $[0, 1]$. Therefore,
regardless of the order of the multiwavelets, there are always the same number of functions
that overlap the boundary, and the procedure for dealing with these functions does
not change. The only difference is that as the order is increased, there are additional
functions supported over $[0, 1]$ and consequently more total functions.

4.7  Multiwavelet  Decomposition  Example 
To complete the discussion of multiwavelets, the piecewise-linear, quadratic, and
cubic multiwavelets are used to decompose the function shown in Figure (4.19). Recall
that the Haar wavelet transform was demonstrated on this same function in Section
3.3. The function is a 0.5 Hz sine wave over the domain $[0, 1]$ with a sharp feature in the
form of a 32 Hz sine component. Because the function is supported on a finite domain,
the method described in the previous section is used to adapt the multiwavelets to the
boundaries. The decomposition of this function using the piecewise-linear, quadratic,
and cubic multiwavelets is shown in Figures (4.20) through (4.22). In all three cases,
the original fine-scale representation of the function is taken at level 8 and the coarsest
level is chosen to be level 0. Therefore, the multiscale wavelet representation of the
function is given in terms of the scaling function approximation on level 0 and the
wavelet details on levels 0 through 7. Note that, on the finer-resolution levels, the
wavelet details are nonzero only where the sharp features of the function occur. On the
coarser scales, there are no details from these sharp features and the wavelet details
focus on the low frequency component of the signal. The effect of the approximation
order of the multiwavelets is also evident from the figures. As the approximation
order increases, the scaling functions yield better approximations of the function at
any given level.

Figure 4.19: Original Function
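
The structure of this decomposition (a level-8 representation reduced to a level-0 approximation plus details on levels 0 through 7) can be illustrated with a short sketch. To keep the example self-contained it uses the scalar Haar wavelet of Section 3.3 rather than the piecewise-polynomial multiwavelets, so it is not the decomposition shown in the figures. The burst location, the burst amplitude of 0.2, and the use of midpoint samples as level-8 scaling coefficients are all assumptions made only for illustration.

```python
import math


def test_signal(n_samples=256, burst=(0.5, 0.5 + 1.0 / 32.0), burst_amplitude=0.2):
    """0.5 Hz sine on [0, 1] plus a short 32 Hz burst (location and amplitude assumed)."""
    values = []
    for i in range(n_samples):
        x = (i + 0.5) / n_samples
        y = math.sin(2.0 * math.pi * 0.5 * x)
        if burst[0] <= x < burst[1]:
            y += burst_amplitude * math.sin(2.0 * math.pi * 32.0 * x)
        values.append(y)
    return values


def haar_step(coeffs):
    """One level of the orthonormal Haar transform: returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    half = len(coeffs) // 2
    approx = [s * (coeffs[2 * i] + coeffs[2 * i + 1]) for i in range(half)]
    detail = [s * (coeffs[2 * i] - coeffs[2 * i + 1]) for i in range(half)]
    return approx, detail


def haar_decompose(coeffs, coarsest=0):
    """Decompose level-J coefficients into a coarsest-level approximation and details."""
    level = int(round(math.log2(len(coeffs))))
    details = {}
    while level > coarsest:
        coeffs, detail = haar_step(coeffs)
        level -= 1
        details[level] = detail
    return coeffs, details


samples = test_signal()
approx0, details = haar_decompose(samples)
finest = details[7]
peak_index = max(range(len(finest)), key=lambda i: abs(finest[i]))
print("levels with details:", sorted(details))
print("largest level-7 detail near x =", (2 * peak_index + 1) / 256.0)
```

With these assumed parameters the largest level-7 detail coefficient falls inside the burst interval, which mirrors the observation that fine-scale details are concentrated where the sharp feature occurs.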

Figure 4.20: Piecewise-Linear Multiwavelet Decomposition

Figure 4.21: Piecewise-Quadratic Multiwavelet Decomposition

Figure 4.22: Piecewise-Cubic Multiwavelet Decomposition


CHAPTER  5 

WAVELETS  SUPPORTED  ON  TRIANGULAR  DOMAINS 


In Chapter 2, it was demonstrated that when the dependent variable of a
time-invariant, nonlinear ODE is expressed in terms of a Volterra series, the second-order
Volterra operator takes the form

(5.1)

Equation (5.1) shows that the second-order operator is dependent on the second-order
Volterra kernel $h_2$ over the triangular domain defined by $0 < \xi < \eta$ and $0 < \eta < t$.
Therefore, in many cases, it is natural and convenient to use the triangular form of
the kernel instead of the symmetric form over the $[0, t] \times [0, t]$ square domain. It
is assumed that the kernel is a decaying function and that, for some time $D$, the
kernel is essentially zero for $t > D$. This implies that the output of the system is not
influenced by inputs occurring further back in time than $D$ (i.e., the system has finite
memory). Then, the second-order kernel is supported over the triangular domain $\Omega$
shown in Figure (5.1).

Figure 5.1: Triangular Domain of Support for the Second-Order Volterra Kernel

5.1 Construction of Triangular Wavelets

In order to develop wavelet-based representations of the triangular second-order
Volterra kernel, it is necessary to derive a multiresolution analysis over the domain
$\Omega$. The construction of wavelets that are supported on compact subsets of $\mathbb{R}^d$, where
$d$ is any positive integer, has been addressed in a series of papers by Micchelli, Xu,
and Chen [4], [5], [101]. In Ref. [4], a procedure is outlined whereby a multiresolution
analysis can be constructed for a compact set $E \subset \mathbb{R}^d$ for which there exists a family
of $m$ contractive affine mappings

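$$\gamma_i : \mathbb{R}^d \to \mathbb{R}^d, \qquad i = 0, 1, \ldots, m - 1$$

such that

$$E = \bigcup_{i=0}^{m-1} \gamma_i(E) \tag{5.2}$$

and

$$\operatorname{meas}\left( \gamma_i(E) \cap \gamma_j(E) \right) = 0, \qquad i \neq j \tag{5.3}$$

where $\operatorname{meas}(A)$ denotes the Lebesgue measure of a set $A \subset \mathbb{R}^d$. A set $E$ that
satisfies these conditions is termed an invariant set relative to the family of mappings
$\Gamma := \{\gamma_i : i = 0, \ldots, m - 1\}$.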

The method detailed in Ref. [4] can be applied to the construction of a
multiresolution analysis over the triangular domain $\Omega \subset \mathbb{R}^2$. First, we define four contractive
affine mappings

$$\gamma_i : \mathbb{R}^2 \to \mathbb{R}^2, \qquad i = 0, \ldots, 3$$

which act on any point $(\xi, \eta) \in \mathbb{R}^2$ as follows:

$\gamma_0(\xi, \eta) =$   (5.4)

$\gamma_1(\xi, \eta) =$   (5.5)

$\gamma_2(\xi, \eta) =$   (5.6)

$\gamma_3(\xi, \eta) =$   (5.7)

These mappings transform the domain $\Omega \subset \mathbb{R}^2$ into the four subsets

$$\Omega_i := \gamma_i(\Omega), \qquad i = 0, \ldots, 3$$

shown in Figure (5.2). Clearly,

$$\Omega = \bigcup_{i=0}^{3} \Omega_i$$

and the subsets $\Omega_i$ do not intersect,

$$\operatorname{meas}\left( \Omega_i \cap \Omega_j \right) = 0, \qquad i \neq j$$

Then, it follows that $\Omega$ is an invariant subset of $\mathbb{R}^2$ relative to the family of mappings
$\Gamma := \{\gamma_i : i = 0, \ldots, 3\}$. The mappings in $\Gamma$ are invertible and it can easily be
verified that the inverse mappings

$$\gamma_i^{-1} : \mathbb{R}^2 \to \mathbb{R}^2, \qquad i = 0, \ldots, 3$$

are given by

$\gamma_0^{-1}(\xi, \eta) =$   (5.8)

$\gamma_1^{-1}(\xi, \eta) =$   (5.9)

$\gamma_2^{-1}(\xi, \eta) =$   (5.10)

$\gamma_3^{-1}(\xi, \eta) =$   (5.11)

The first step towards generating a multiresolution analysis over $\Omega$ is to find an
$n$-dimensional refinable curve $f \in L^2(\Omega)$. This curve defines a subspace $V_0 \subset L^2(\Omega)$:

$$V_0 := \left\{ c^T f : c \in \mathbb{R}^n \right\} \tag{5.12}$$

The components $f_i$, $i = 1, \ldots, n$, represent basis functions for the subspace $V_0$. For
a given set $E$, a vector field $f(t)$, $t \in E$, is refinable if there exist $n \times n$ matrices
$[A_0], \ldots, [A_{m-1}]$ such that

$$G_i f(t) = [A_i]\, f(t), \qquad i = 0, \ldots, m - 1 \tag{5.13}$$

and

$$v^T f(t) = a \tag{5.14}$$

where $v \in \mathbb{R}^n$ and $a$ is a nonzero constant [5]. In equation (5.13), $G_i$, $i = 0, \ldots, m - 1$,
represent operators defined as

$$G_i f(t) := f \circ \gamma_i(t) = f(\gamma_i(t)) \tag{5.15}$$

so that

$$G_i f(t) = \begin{bmatrix} G_i f_1(t) \\ \vdots \\ G_i f_n(t) \end{bmatrix} = \begin{bmatrix} f_1 \circ \gamma_i(t) \\ \vdots \\ f_n \circ \gamma_i(t) \end{bmatrix}, \qquad i = 0, \ldots, m - 1$$


A simple choice of a refinable curve in $L^2(\Omega)$ is the one-dimensional vector $f = \chi_\Omega$,
where $\chi_\Omega$ is the characteristic function of the set $\Omega$, defined as

$$\chi_\Omega(\xi, \eta) := \begin{cases} 1, & (\xi, \eta) \in \Omega \\ 0, & \text{otherwise} \end{cases} \tag{5.16}$$

To show that $\chi_\Omega$ is a refinable curve in $L^2(\Omega)$, it must be demonstrated that equations
(5.13) and (5.14) are satisfied for any point $(\xi, \eta) \in \Omega$. In this case, equations (5.13)
and (5.14) take the form

$$G_i \chi_\Omega(\xi, \eta) = A_i\, \chi_\Omega(\xi, \eta), \qquad i = 0, \ldots, 3 \tag{5.17}$$

$$v\, \chi_\Omega(\xi, \eta) = a \tag{5.18}$$

where $A_i$, $i = 0, \ldots, 3$, and $v$ are now nonzero constants. Using equation (5.15),

$$G_i \chi_\Omega(\xi, \eta) = \chi_\Omega \circ \gamma_i(\xi, \eta) = \chi_\Omega(\gamma_i(\xi, \eta)), \qquad i = 0, \ldots, 3$$

Clearly, any point $(\xi, \eta) \in \Omega$ subject to the mapping $\gamma_i$ will be mapped into the
set $\Omega_i$. Since $\Omega_i \subset \Omega$ for $i = 0, 1, 2, 3$, and using the definition of the characteristic
function $\chi_\Omega$ given in equation (5.16), it follows that

$$G_i \chi_\Omega(\xi, \eta) = \chi_\Omega(\gamma_i(\xi, \eta)) = 1, \qquad i = 0, \ldots, 3$$

for all $(\xi, \eta) \in \Omega$. Then, equation (5.17) is satisfied by choosing $A_i = 1$, $i = 0, \ldots, 3$.
Equation (5.18) is trivially satisfied by choosing $v = 1$. Then, $a = 1$ for all $(\xi, \eta) \in \Omega$.

Now that it has been established that $\chi_\Omega$ represents a refinable curve in $L^2(\Omega)$,
the subspace $V_0 \subset L^2(\Omega)$ is defined as

$$V_0 := \left\{ c_1 \chi_\Omega : c_1 \in \mathbb{R} \right\} \tag{5.19}$$

That is, $V_0$ is the one-dimensional vector space spanned by the characteristic function
$\chi_\Omega$. In Ref. [4], it is shown that, given the coarsest-scale space $V_0$, a sequence of nested
spaces

$$V_{j+1} \supseteq V_j, \qquad j = 0, 1, 2, \ldots$$

can be constructed. This result is achieved via the following family of linear operators
on $L^2(\Omega)$:

$$(T_i g)(\xi, \eta) = \sum_{j=0}^{3} Q_{ij}\, g\left( \gamma_j^{-1}(\xi, \eta) \right) \chi_{\Omega_j}(\xi, \eta), \qquad i = 0, \ldots, 3 \tag{5.20}$$

In equation (5.20), $g$ represents any function in $L^2(\Omega)$ and $\chi_{\Omega_j}$ denotes the
characteristic function over the set $\Omega_j$. The $\{Q_{ij}\}_{i,j=0}^{3}$ are the entries of a $4 \times 4$ orthogonal
matrix $[Q]$. Because $[Q]$ is orthogonal, it satisfies the condition

$$[Q]^T [Q] = [I] \tag{5.21}$$

where $[I]$ is the $4 \times 4$ identity matrix. Using the operators $\{T_i\}$ in equation (5.20),
the finer-resolution spaces $\{V_{j+1}\}$ are constructed from the coarser-resolution spaces
$\{V_j\}$ via the recurrence formula

$$V_{j+1} := \bigoplus_{i=0}^{3} T_i V_j, \qquad j = 0, 1, 2, \ldots \tag{5.22}$$

where the symbol $\oplus$ denotes the sum of spaces. The orthogonality of $[Q]$ ensures
that the spaces $\{T_i V_j\}$ do not intersect:

$$\bigcap_{i=0}^{3} T_i V_j = \{0\}$$

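In applying equation (5.22), we first choose the matrix $[Q]$ to be the $4 \times 4$ identity
matrix. Then, the operators in equation (5.20) take the simplified form

$$(T_i g)(\xi,\eta) = g\left(\gamma_i^{-1}(\xi,\eta)\right) \chi_{\Omega_i}(\xi,\eta), \quad i = 0,\ldots,3 \tag{5.23}$$

In simple terms, the operators $\{T_i\}$ restrict functions in $\Omega$ to the subsets $\{\Omega_i\} \subset \Omega$ in
the same manner as the mappings $\{\gamma_i\}$ map points in $\Omega$ into $\{\Omega_i\}$. Equation (5.22)
can now be used to construct the subspace $V_1$ from $V_0$:

$$V_1 = T_0 V_0 \oplus T_1 V_0 \oplus T_2 V_0 \oplus T_3 V_0 \tag{5.24}$$

Recall that $V_0 = \operatorname{span}\{\chi_\Omega\}$. The operator $T_0$ in equation (5.23) acts on $\chi_\Omega$ in the
following manner:

$$(T_0 \chi_\Omega)(\xi,\eta) = \chi_\Omega\left(\gamma_0^{-1}(\xi,\eta)\right) \chi_{\Omega_0}(\xi,\eta) \tag{5.25}$$

The function $\chi_{\Omega_0}$ in equation (5.25) requires that $(\xi,\eta) \in \Omega_0$ in order for $(T_0 \chi_\Omega)(\xi,\eta)$
to be nonzero. The inverse mapping $\gamma_0^{-1}$ maps any $(\xi,\eta) \in \Omega_0$ into the larger set $\Omega$. Then,

$$\chi_\Omega\left(\gamma_0^{-1}(\xi,\eta)\right) = 1 \quad \forall\, (\xi,\eta) \in \Omega_0$$

Thus, we have

$$(T_0 \chi_\Omega)(\xi,\eta) = \chi_{\Omega_0}(\xi,\eta) \tag{5.26}$$

In a similar manner, we obtain

$$(T_1 \chi_\Omega)(\xi,\eta) = \chi_{\Omega_1}(\xi,\eta) \tag{5.27}$$

$$(T_2 \chi_\Omega)(\xi,\eta) = \chi_{\Omega_2}(\xi,\eta) \tag{5.28}$$

$$(T_3 \chi_\Omega)(\xi,\eta) = \chi_{\Omega_3}(\xi,\eta) \tag{5.29}$$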

Then, the space $V_1$ is defined as the vector space spanned by the characteristic
functions of the sets $\{\Omega_i\}$ shown in Figure (5.2):

$$V_1 := \operatorname{span}\{\chi_{\Omega_i} : i = 0,\ldots,3\} \tag{5.30}$$

It should be noted that $\dim(V_1) = 4$ and that, in general, $\dim(V_j) = 4^j$. Noting that

$$\chi_\Omega(\xi,\eta) = \sum_{i=0}^{3} \chi_{\Omega_i}(\xi,\eta)$$

it is clear that $V_0 \subset V_1$ since any function in $V_0$ can also be expressed in terms of the
basis functions of $V_1$.

Now that the space $V_1$ has been generated from $V_0$, equation (5.22) can be used
again to construct the space $V_2$ from $V_1$:

$$V_2 = T_0 V_1 \oplus T_1 V_1 \oplus T_2 V_1 \oplus T_3 V_1 \tag{5.31}$$

The basis functions for the space $T_0 V_1$ are obtained by applying the operator $T_0$ to
the basis functions of $V_1$. For example,

$$(T_0 \chi_{\Omega_0})(\xi,\eta) = \chi_{\Omega_0}\left(\gamma_0^{-1}(\xi,\eta)\right) \chi_{\Omega_0}(\xi,\eta)$$

It is not difficult to verify that $(T_0 \chi_{\Omega_0})(\xi,\eta)$ is equal to one if $(\xi,\eta) \in \Omega_{00}$ and zero
otherwise, where $\Omega_{00}$ is the subset of $\Omega$ shown in Figure (5.3). Therefore,

$$(T_0 \chi_{\Omega_0})(\xi,\eta) = \chi_{\Omega_{00}}(\xi,\eta) \tag{5.32}$$

where $\chi_{\Omega_{00}}(\xi,\eta)$ is the characteristic function of the set $\Omega_{00}$. Following the same
argument, in general we have

$$(T_i \chi_{\Omega_j})(\xi,\eta) = \chi_{\Omega_j}\left(\gamma_i^{-1}(\xi,\eta)\right) \chi_{\Omega_i}(\xi,\eta) = \chi_{\Omega_{ji}}(\xi,\eta), \quad i,j \in \{0,1,2,3\} \tag{5.33}$$

Therefore, $V_2$ is the 16-dimensional subspace spanned by the characteristic functions
over the sets $\Omega_{ij}$ shown in Figure (5.3). That is,

$$V_2 = \operatorname{span}\{\chi_{\Omega_{ij}} : i,j \in \{0,1,2,3\}\} \tag{5.34}$$

By inspection, it is clear that $V_1 \subset V_2$. Equation (5.22) can be applied recursively to
obtain spaces $\{V_j\} \subset L^2(\Omega)$ of increasingly finer resolution. In Ref. [4], it is proven
that

$$\overline{\bigcup_{j=0}^{\infty} V_j} = L^2(\Omega) \tag{5.35}$$

meaning that as $j$ approaches infinity, the spaces $\{V_j\}$ approach $L^2(\Omega)$.

Figure 5.3: Subsets $\Omega_{ij} \subset \Omega$

To this point, a series of nested subspaces $\{V_j\} \subset L^2(\Omega)$ has been generated
using the technique of Micchelli and Xu [4]. The basis functions for these spaces are
the scaling functions, or generators, of the multiresolution analysis. It now remains
to construct the orthogonal complement spaces $\{W_j\}$ such that

$$W_j := V_{j+1} \ominus V_j, \quad j = 0, 1, 2, \ldots \tag{5.36}$$

where the spaces $V_j$ and $W_j$ are orthogonal. Before doing this, it is convenient to
normalize the scaling functions that form the bases for the various spaces $\{V_j\}$. Working
with orthonormal bases simplifies the decomposition and reconstruction formulas that
will be derived later. Toward this end, we start with the subspace $V_0$, which is the
one-dimensional space spanned by the characteristic function over $\Omega$, $\chi_\Omega$. Denoting
by $A_0$ the area of the triangular domain $\Omega$, it is easy to verify that the function

$$\phi(\xi,\eta) := \frac{1}{\sqrt{A_0}}\, \chi_\Omega(\xi,\eta) \tag{5.37}$$

is an orthonormal basis for $V_0$:

$$\langle \phi, \phi \rangle_{L^2(\Omega)} = \int_\Omega \phi^2 \, d\Omega = \frac{1}{A_0} \int_\Omega d\Omega = 1$$

From this point forward, the subspace $V_0$ will be defined as

$$V_0 := \operatorname{span}\{\phi\} \tag{5.38}$$

To derive orthonormal scaling functions for the other spaces $\{V_j\}$, $j \ge 1$, we first
note that the basis functions for these subspaces are already orthogonal since they
are composed of characteristic functions whose supports do not overlap. Following the same
argument as for the basis function of $V_0$, the basis functions of each space $V_j$ will
be normalized if they are scaled by a factor of $A_j^{-1/2}$, where $A_j$ is the area of the
domain over which the functions are supported. At this point, it is convenient to
introduce some new notation for the scaling functions that span the spaces $\{V_j\}$. Each
scaling function can be parameterized by two indices, $j$ and $\mathbb{K}$, where $j$ represents
the resolution level of the function and $\mathbb{K}$ represents a $j$-dimensional multi-index
$\mathbb{K} = k_1 k_2 \ldots k_j$, where $k_i \in \{0,1,2,3\}$, $i = 1,\ldots,j$. The multi-index $\mathbb{K}$ gives the
sequence of operators $T_{k_i}$, in the order in which they are applied, that transforms the
characteristic function over $\Omega$ into the given function. For example, the function
$\chi_{\Omega_{01}}$ is obtained as

$$\chi_{\Omega_{01}}(\xi,\eta) = T_1 \circ \chi_{\Omega_0}(\xi,\eta) = T_1 \circ T_0 \circ \chi_\Omega(\xi,\eta)$$

Therefore, $\chi_{\Omega_{01}}$ is parameterized by the indices $j = 2$ and $\mathbb{K} = 01$ and can be written
as $\chi_{2,01}$. In general, noting that

$$A_j = 2^{-2j} A_0 \quad \Longrightarrow \quad \frac{1}{\sqrt{A_j}} = \frac{2^j}{\sqrt{A_0}}$$

the orthonormal scaling functions are defined as

$$\phi_{j,\mathbb{K}}(\xi,\eta) = 2^j \left( T_{k_j} \circ T_{k_{j-1}} \circ \cdots \circ T_{k_1} \circ \phi(\xi,\eta) \right) \tag{5.39}$$

For convenience, $\Lambda_j$ is defined as the set of all possible multi-indices of length $j$, where
each member of the multi-index is an element of the set $\{0, 1, 2, 3\}$. Then, the spaces
$\{V_j\}$ are defined as

$$V_j := \operatorname{span}\{\phi_{j,\mathbb{K}}\}_{\mathbb{K} \in \Lambda_j} \tag{5.40}$$

Now that we have generated orthonormal scaling functions that span the spaces
$\{V_j\}$, the wavelets that span the orthogonal complement spaces $\{W_j\}$ can be
constructed. Micchelli and Xu [4] have shown that, once a basis for the coarsest-scale
wavelet space $W_0$ is obtained, bases for the finer-resolution spaces $\{W_j\}$ can be
generated through a recursive application of the operators $\{T_i\}$. With this in mind, a
wavelet basis for the orthogonal complement space $W_0$ that satisfies

$$W_0 = V_1 \ominus V_0 \tag{5.41}$$

where the basis functions of $W_0$ are orthogonal to the basis function of $V_0$, must be
constructed. Recall that

$$V_1 = \operatorname{span}\{\phi_{1,0}, \phi_{1,1}, \phi_{1,2}, \phi_{1,3}\}$$

where the scaling functions are defined as in equation (5.39). Since $\dim(V_1) = 4$ and
$\dim(V_0) = 1$, we must have $\dim(W_0) = 3$. Therefore, three wavelets $\{\psi^1, \psi^2, \psi^3\}$
must be constructed that are orthogonal to the scaling function $\phi$ that spans $V_0$ and
satisfy equation (5.41). In addition, the wavelets will be constructed to be mutually
orthonormal so that they satisfy

$$\langle \psi^r, \psi^s \rangle_{L^2(\Omega)} = \int_\Omega \psi^r(\xi,\eta)\, \psi^s(\xi,\eta)\, d\Omega = \delta_{r,s} \tag{5.42}$$

and form an orthonormal basis for $W_0$. In order to ensure that equation (5.41) is
satisfied, the three wavelets must be linear combinations of the four scaling functions
that span $V_1$.


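$$\psi^r(\xi,\eta) = \sum_{i=0}^{3} c_i^r\, \phi_{1,i}(\xi,\eta), \quad r = 1, 2, 3 \tag{5.43}$$

It is easy to verify by inspection that

$$\phi(\xi,\eta) = \tfrac{1}{2}\phi_{1,0}(\xi,\eta) + \tfrac{1}{2}\phi_{1,1}(\xi,\eta) + \tfrac{1}{2}\phi_{1,2}(\xi,\eta) + \tfrac{1}{2}\phi_{1,3}(\xi,\eta) \tag{5.44}$$

Equations (5.43) and (5.44) can be combined into the following matrix form:

$$\begin{Bmatrix} \phi \\ \psi^1 \\ \psi^2 \\ \psi^3 \end{Bmatrix} =
\begin{bmatrix}
\tfrac{1}{2} & \tfrac{1}{2} & \tfrac{1}{2} & \tfrac{1}{2} \\
c_0^1 & c_1^1 & c_2^1 & c_3^1 \\
c_0^2 & c_1^2 & c_2^2 & c_3^2 \\
c_0^3 & c_1^3 & c_2^3 & c_3^3
\end{bmatrix}
\begin{Bmatrix} \phi_{1,0} \\ \phi_{1,1} \\ \phi_{1,2} \\ \phi_{1,3} \end{Bmatrix} \tag{5.45}$$

The coefficients $\{c_i^r\}$ must be chosen in such a manner that the coefficient matrix
in equation (5.45) is of full rank. Then, the functions $\{\phi, \psi^1, \psi^2, \psi^3\}$ form a linearly
independent set. If the coefficients are chosen such that

$$\{c_0^1, c_1^1, c_2^1, c_3^1\} = \{\alpha, -\alpha, 0, 0\}$$

$$\{c_0^2, c_1^2, c_2^2, c_3^2\} = \{0, 0, \beta, -\beta\}$$

$$\{c_0^3, c_1^3, c_2^3, c_3^3\} = \{\lambda, \lambda, -\lambda, -\lambda\}$$

the coefficient matrix takes the form

$$\begin{bmatrix}
\tfrac{1}{2} & \tfrac{1}{2} & \tfrac{1}{2} & \tfrac{1}{2} \\
\alpha & -\alpha & 0 & 0 \\
0 & 0 & \beta & -\beta \\
\lambda & \lambda & -\lambda & -\lambda
\end{bmatrix} \tag{5.46}$$

Figure 5.4: Wavelets That Form a Basis for $W_0$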


Clearly, the coefficient matrix shown in equation (5.46) is of full rank for any nonzero
values of $\alpha$, $\beta$, and $\lambda$. The wavelets then take the form shown in Figure (5.4),
where the regions are marked $\oplus$, $\ominus$, or 0 to show where the functions are positive,
negative, or zero. By inspection, it is clear that the wavelets are all orthogonal to the
scaling function $\phi$, which is a positive constant over the entire domain $\Omega$. Therefore,
these wavelets form a basis for the orthogonal complement space $W_0$. Referring again
to Figure (5.4), it is apparent that the wavelets are mutually orthogonal. The only
remaining task is to calculate values for $\alpha$, $\beta$, and $\lambda$ such that the wavelets are
orthonormal. The following equations must be satisfied:


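$$\int_\Omega [\psi^1(\xi,\eta)]^2 \, d\Omega = \alpha^2 \int_{\Omega_0} [\phi_{1,0}(\xi,\eta)]^2 \, d\Omega_0 + \alpha^2 \int_{\Omega_1} [\phi_{1,1}(\xi,\eta)]^2 \, d\Omega_1 = 1 \tag{5.47}$$

$$\int_\Omega [\psi^2(\xi,\eta)]^2 \, d\Omega = \beta^2 \int_{\Omega_2} [\phi_{1,2}(\xi,\eta)]^2 \, d\Omega_2 + \beta^2 \int_{\Omega_3} [\phi_{1,3}(\xi,\eta)]^2 \, d\Omega_3 = 1 \tag{5.48}$$

$$\begin{aligned}
\int_\Omega [\psi^3(\xi,\eta)]^2 \, d\Omega ={}& \lambda^2 \int_{\Omega_0} [\phi_{1,0}(\xi,\eta)]^2 \, d\Omega_0 + \lambda^2 \int_{\Omega_1} [\phi_{1,1}(\xi,\eta)]^2 \, d\Omega_1 \\
&+ \lambda^2 \int_{\Omega_2} [\phi_{1,2}(\xi,\eta)]^2 \, d\Omega_2 + \lambda^2 \int_{\Omega_3} [\phi_{1,3}(\xi,\eta)]^2 \, d\Omega_3 = 1
\end{aligned} \tag{5.49}$$

Recall that the scaling functions are orthonormal. Then, the integrals of the form

$$\int_{\Omega_i} [\phi_{1,i}(\xi,\eta)]^2 \, d\Omega_i, \quad i = 0,\ldots,3$$

reduce to 1 and we have

$$\{\alpha, \beta, \lambda\} = \left\{ \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}, \tfrac{1}{2} \right\}$$

Therefore, the orthonormal wavelets are defined as

$$\psi^1(\xi,\eta) := \tfrac{1}{\sqrt{2}}\phi_{1,0}(\xi,\eta) - \tfrac{1}{\sqrt{2}}\phi_{1,1}(\xi,\eta) \tag{5.50}$$

$$\psi^2(\xi,\eta) := \tfrac{1}{\sqrt{2}}\phi_{1,2}(\xi,\eta) - \tfrac{1}{\sqrt{2}}\phi_{1,3}(\xi,\eta) \tag{5.51}$$

$$\psi^3(\xi,\eta) := \tfrac{1}{2}\phi_{1,0}(\xi,\eta) + \tfrac{1}{2}\phi_{1,1}(\xi,\eta) - \tfrac{1}{2}\phi_{1,2}(\xi,\eta) - \tfrac{1}{2}\phi_{1,3}(\xi,\eta) \tag{5.52}$$

and the coarsest-scale wavelet space $W_0$ is defined as

$$W_0 := \operatorname{span}\{\psi^1, \psi^2, \psi^3\} \tag{5.53}$$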
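
The orthonormality of the level-0 functions can be checked numerically. Because the four scaling functions on level 1 are orthonormal, the Gram matrix of $\{\phi, \psi^1, \psi^2, \psi^3\}$ equals the product of the coefficient matrix with its own transpose. The following is a minimal sketch assuming NumPy; the variable names `filters` and `gram` are illustrative and do not come from the text.

```python
import numpy as np

s = 1.0 / np.sqrt(2.0)

# Rows hold the coefficients of phi, psi^1, psi^2, psi^3 with respect to the
# orthonormal level-1 scaling functions phi_{1,0}, ..., phi_{1,3}
# (alpha = beta = 1/sqrt(2), lambda = 1/2).
filters = np.array([
    [0.5, 0.5, 0.5, 0.5],
    [s, -s, 0.0, 0.0],
    [0.0, 0.0, s, -s],
    [0.5, 0.5, -0.5, -0.5],
])

# Inner products between the four level-0 functions; since the level-1
# basis is orthonormal, this is just filters times its transpose.
gram = filters @ filters.T
print(np.round(gram, 12))
```

An identity Gram matrix indicates that the wavelets are orthonormal and orthogonal to the scaling function, and that the coefficient matrix has full rank.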

In Ref. [4], it is demonstrated that once the wavelets that span $W_0$ are defined,
the wavelets that span the higher-resolution spaces $\{W_j\}$, $j \ge 1$, can be generated
through a recursive application of the formula

$$W_{j+1} := \bigoplus_{i=0}^{3} T_i W_j \tag{5.54}$$

where the spaces are nested:

$$W_{j+1} \supseteq W_j, \quad j = 0, 1, 2, \ldots \tag{5.55}$$

Therefore, the wavelet spaces $\{W_j\}$ are constructed using the same operators as the
scaling function spaces $\{V_j\}$. It should be noted that, since

$$V_{j+1} = V_j \oplus W_j$$

and $\dim(V_j) = 4^j$, the dimension of each space $W_j$ is given by

$$\dim(W_j) = \dim(V_{j+1}) - \dim(V_j) = 4^{j+1} - 4^j = 3 \cdot 4^j \tag{5.56}$$

Because the wavelets that span the spaces $\{W_j\}$ are generated by applying the same
technique that was used for the scaling functions, we define

$$\psi^r_{j,\mathbb{K}}(\xi,\eta) := 2^j \left( T_{k_j} \circ T_{k_{j-1}} \circ \cdots \circ T_{k_1} \circ \psi^r(\xi,\eta) \right), \quad r = 1, 2, 3 \tag{5.57}$$

Equation (5.57) is of the same form as equation (5.39), which defines an arbitrary
scaling function $\phi_{j,\mathbb{K}}$. The $2^j$ term is a normalization factor and the multi-index
$\mathbb{K}$ defines the sequence of mappings that generates each wavelet from the original
wavelet in $W_0$. The wavelet spaces $\{W_j\}$ are then defined as

$$W_j := \operatorname{span}\{\psi^r_{j,\mathbb{K}} : r = 1, 2, 3\}_{\mathbb{K} \in \Lambda_j} \tag{5.58}$$

We now have a multiresolution analysis over the domain $\Omega$. Note that the wavelet
complement spaces $\{W_j\}$ are orthogonal to one another. This can be demonstrated
through a recursive application of the relationship

$$V_{j+1} = V_j \oplus W_j \tag{5.59}$$

where $W_j$ is orthogonal to $V_j$. But equation (5.59) can also be written as

$$V_{j+1} = (V_{j-1} \oplus W_{j-1}) \oplus W_j \tag{5.60}$$

Since $W_{j-1} \subset V_j$ and $V_j$ is orthogonal to $W_j$, it follows that $W_{j-1}$ is orthogonal to $W_j$.
Therefore, using the result from equation (5.35), we have the following orthogonal
decomposition of $L^2(\Omega)$:

$$L^2(\Omega) = V_0 \oplus \bigoplus_{j=0}^{\infty} W_j \tag{5.61}$$


5.2  Triangular  Wavelet  Decomposition  and  Reconstruction 
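To this point, a multiresolution analysis over the set $\Omega$ has been constructed using
the method detailed by Micchelli and Xu [4]. Some of the tools that will enable us
to apply this multiresolution analysis to the approximation of functions in $L^2(\Omega)$ can
now be developed. First, the two-scale relationships, which determine how coarser-scale
scaling functions and wavelets are obtained as weighted combinations of scaling
functions on the next-finer scale, must be derived. Recall that the scaling function
and wavelets on level 0 have already been expressed in terms of the scaling functions
on level 1. We have the expressions

$$\phi(\xi,\eta) = \tfrac{1}{2}\phi_{1,0}(\xi,\eta) + \tfrac{1}{2}\phi_{1,1}(\xi,\eta) + \tfrac{1}{2}\phi_{1,2}(\xi,\eta) + \tfrac{1}{2}\phi_{1,3}(\xi,\eta) \tag{5.62}$$

$$\psi^1(\xi,\eta) = \tfrac{1}{\sqrt{2}}\phi_{1,0}(\xi,\eta) - \tfrac{1}{\sqrt{2}}\phi_{1,1}(\xi,\eta) \tag{5.63}$$

$$\psi^2(\xi,\eta) = \tfrac{1}{\sqrt{2}}\phi_{1,2}(\xi,\eta) - \tfrac{1}{\sqrt{2}}\phi_{1,3}(\xi,\eta) \tag{5.64}$$

$$\psi^3(\xi,\eta) = \tfrac{1}{2}\phi_{1,0}(\xi,\eta) + \tfrac{1}{2}\phi_{1,1}(\xi,\eta) - \tfrac{1}{2}\phi_{1,2}(\xi,\eta) - \tfrac{1}{2}\phi_{1,3}(\xi,\eta) \tag{5.65}$$

These relationships can be written in the form

$$\phi(\xi,\eta) = \sum_{p=0}^{3} a_p\, \phi_{1,p}(\xi,\eta) \tag{5.66}$$

$$\psi^r(\xi,\eta) = \sum_{p=0}^{3} b_p^r\, \phi_{1,p}(\xi,\eta), \quad r = 1, 2, 3 \tag{5.67}$$

where the scaling function filters $\{a_p\}$ and the wavelet filters $\{b_p^r\}$ are defined as

$$\{a_p\} = \{a_0, a_1, a_2, a_3\} = \left\{ \tfrac{1}{2}, \tfrac{1}{2}, \tfrac{1}{2}, \tfrac{1}{2} \right\} \tag{5.68}$$

$$\{b_p^1\} = \{b_0^1, b_1^1, b_2^1, b_3^1\} = \left\{ \tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}, 0, 0 \right\} \tag{5.69}$$

$$\{b_p^2\} = \{b_0^2, b_1^2, b_2^2, b_3^2\} = \left\{ 0, 0, \tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}} \right\} \tag{5.70}$$

$$\{b_p^3\} = \{b_0^3, b_1^3, b_2^3, b_3^3\} = \left\{ \tfrac{1}{2}, \tfrac{1}{2}, -\tfrac{1}{2}, -\tfrac{1}{2} \right\} \tag{5.71}$$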

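Equations (5.66) and (5.67) represent the two-scale relationships between the scaling
functions and wavelets on level 0 and the scaling functions on level 1. These equations
can be generalized so that they apply over any two adjacent resolution levels. To
show this, an arbitrary sequence of linear operators, $T_{k_1}, T_{k_2}, \ldots, T_{k_j}$, $k_i \in \{0, 1, 2, 3\}$,
is applied to both sides of equations (5.66) and (5.67):

$$T_{k_j} \circ \cdots \circ T_{k_1} \circ \phi(\xi,\eta) = \sum_{p=0}^{3} a_p \left\{ T_{k_j} \circ \cdots \circ T_{k_1} \circ \phi_{1,p}(\xi,\eta) \right\} \tag{5.72}$$

$$T_{k_j} \circ \cdots \circ T_{k_1} \circ \psi^r(\xi,\eta) = \sum_{p=0}^{3} b_p^r \left\{ T_{k_j} \circ \cdots \circ T_{k_1} \circ \phi_{1,p}(\xi,\eta) \right\} \tag{5.73}$$

where the operators $\{T_i\}$ are as defined in equation (5.23). Note that equations (5.72)
and (5.73) take advantage of the fact that each $T_i$ is a linear operator. Recall that
the scaling functions and wavelets on any given level $j$ are defined as

$$\phi_{j,\mathbb{K}}(\xi,\eta) = 2^j \left( T_{k_j} \circ T_{k_{j-1}} \circ \cdots \circ T_{k_1} \circ \phi(\xi,\eta) \right)$$

$$\psi^r_{j,\mathbb{K}}(\xi,\eta) = 2^j \left( T_{k_j} \circ T_{k_{j-1}} \circ \cdots \circ T_{k_1} \circ \psi^r(\xi,\eta) \right)$$

Then, equations (5.72) and (5.73) can be written as

$$\phi_{j,\mathbb{K}}(\xi,\eta) = 2^j \sum_{p=0}^{3} a_p \left\{ T_{k_j} \circ \cdots \circ T_{k_1} \circ \phi_{1,p}(\xi,\eta) \right\} \tag{5.74}$$

$$\psi^r_{j,\mathbb{K}}(\xi,\eta) = 2^j \sum_{p=0}^{3} b_p^r \left\{ T_{k_j} \circ \cdots \circ T_{k_1} \circ \phi_{1,p}(\xi,\eta) \right\} \tag{5.75}$$

Note that

$$\phi_{1,p}(\xi,\eta) = 2\, T_p \circ \phi(\xi,\eta)$$

which implies that

$$\begin{aligned}
T_{k_j} \circ \cdots \circ T_{k_1} \circ \phi_{1,p}(\xi,\eta) &= 2\, T_{k_j} \circ \cdots \circ T_{k_1} \circ T_p \circ \phi(\xi,\eta) \\
&= 2 \cdot 2^{-(j+1)}\, \phi_{j+1,p|\mathbb{K}}(\xi,\eta) \\
&= 2^{-j}\, \phi_{j+1,p|\mathbb{K}}(\xi,\eta)
\end{aligned}$$

where $p|\mathbb{K}$ denotes the multi-index $p k_1 k_2 \ldots k_j$, of length $j + 1$. It follows that the
generalized two-scale equations are given by

$$\phi_{j,\mathbb{K}}(\xi,\eta) = \sum_{p=0}^{3} a_p\, \phi_{j+1,p|\mathbb{K}}(\xi,\eta) \tag{5.76}$$

$$\psi^r_{j,\mathbb{K}}(\xi,\eta) = \sum_{p=0}^{3} b_p^r\, \phi_{j+1,p|\mathbb{K}}(\xi,\eta), \quad r = 1, 2, 3 \tag{5.77}$$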

Any function $f \in L^2(\Omega)$ can be approximated as a linear combination of the
scaling functions that span a fine-scale space $V_j$. The resulting approximation, denoted
as $f_j$, is given by

$$f_j(\xi,\eta) = \sum_{\mathbb{K} \in \Lambda_j} \alpha_{j,\mathbb{K}}\, \phi_{j,\mathbb{K}}(\xi,\eta) \tag{5.78}$$

where $\{\alpha_{j,\mathbb{K}}\}$ represent constant scaling function coefficients. Equation (5.78) is a
single-scale expansion of the function $f$ on level $j$. Recalling that

$$V_j = V_{j-1} \oplus W_{j-1} \tag{5.79}$$

a two-scale expansion of $f_j$ can be written as a linear combination of the scaling
functions and wavelets on the coarser-resolution level $j - 1$:

$$f_j(\xi,\eta) = \sum_{\mathbb{K} \in \Lambda_{j-1}} \alpha_{j-1,\mathbb{K}}\, \phi_{j-1,\mathbb{K}}(\xi,\eta) + \sum_{r=1}^{3} \sum_{\mathbb{K} \in \Lambda_{j-1}} \beta^r_{j-1,\mathbb{K}}\, \psi^r_{j-1,\mathbb{K}}(\xi,\eta) \tag{5.80}$$

Applying equation (5.79) recursively, we have the following decomposition of the
subspace $V_j$:

$$V_j = V_0 \oplus W_0 \oplus W_1 \oplus \cdots \oplus W_{j-2} \oplus W_{j-1} \tag{5.81}$$

which leads to a multilevel representation of $f_j$ of the form

$$f_j(\xi,\eta) = \alpha\, \phi(\xi,\eta) + \sum_{r=1}^{3} \sum_{l=0}^{j-1} \sum_{\mathbb{K} \in \Lambda_l} \beta^r_{l,\mathbb{K}}\, \psi^r_{l,\mathbb{K}}(\xi,\eta) \tag{5.82}$$

Note that the first term on the right-hand side of equation (5.82) is the contribution
of the scaling function $\phi$ that spans $V_0$.
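
As a quick consistency check on the decomposition in equation (5.81), the dimensions of the spaces on the right-hand side should add up to $\dim(V_j) = 4^j$. Using $\dim(V_0) = 1$ together with equation (5.56), a short geometric-series calculation (added here as an illustrative sketch) gives

```latex
\dim(V_0) + \sum_{l=0}^{j-1} \dim(W_l)
  = 1 + 3 \sum_{l=0}^{j-1} 4^l
  = 1 + 3 \cdot \frac{4^j - 1}{4 - 1}
  = 4^j
```

so the multilevel expansion in equation (5.82) carries exactly as many coefficients as the single-scale expansion in equation (5.78).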

Recall that the scaling functions on the same level are orthonormal, so that they
satisfy

$$\int_\Omega \phi_{j,\mathbb{K}}(\xi,\eta)\, \phi_{j,\mathbb{P}}(\xi,\eta)\, d\Omega = \delta_{\mathbb{K},\mathbb{P}} \tag{5.83}$$

In addition, the scaling functions on any given level are orthogonal to the wavelets
on levels of the same or finer resolution:

$$\int_\Omega \phi_{j,\mathbb{K}}(\xi,\eta)\, \psi^r_{l,\mathbb{P}}(\xi,\eta)\, d\Omega = 0 \quad \forall\, \mathbb{K}, \mathbb{P}, \quad r \in \{1, 2, 3\}, \quad j \le l \tag{5.84}$$

The wavelets on any given level are orthonormal to the wavelets on the same or any
other level:

$$\int_\Omega \psi^r_{j,\mathbb{K}}(\xi,\eta)\, \psi^s_{l,\mathbb{P}}(\xi,\eta)\, d\Omega = \delta_{(j,r,\mathbb{K}),(l,s,\mathbb{P})} \tag{5.85}$$

The orthonormality properties listed in equations (5.83) through (5.85) greatly
simplify many of the calculations that are made with the scaling functions and
wavelets. To start, the scaling function coefficients in equation (5.78) can be computed
by multiplying both sides of the equation by a scaling function $\phi_{j,\mathbb{K}}$ and integrating over $\Omega$. Making use
of equation (5.83), we have

$$\alpha_{j,\mathbb{K}} = \int_\Omega f(\xi,\eta)\, \phi_{j,\mathbb{K}}(\xi,\eta)\, d\Omega \tag{5.86}$$

It is not difficult to derive relationships between the fine-scale coefficients that appear
in the single-scale expansion of $f$ in equation (5.78) and the multiscale coefficients
that appear in equation (5.82). Towards this goal, equations (5.78) and (5.80) can
be combined into the expression

$$\sum_{\mathbb{K}} \alpha_{j,\mathbb{K}}\, \phi_{j,\mathbb{K}}(\xi,\eta) = \sum_{\mathbb{K}} \alpha_{j-1,\mathbb{K}}\, \phi_{j-1,\mathbb{K}}(\xi,\eta) + \sum_{r=1}^{3} \sum_{\mathbb{K}} \beta^r_{j-1,\mathbb{K}}\, \psi^r_{j-1,\mathbb{K}}(\xi,\eta) \tag{5.87}$$

Suppose that, knowing the finer-scale expansion coefficients $\{\alpha_{j,\mathbb{K}}\}$, we wish to
calculate the coarser-scale coefficients $\{\alpha_{j-1,\mathbb{K}}\}$ and $\{\beta^r_{j-1,\mathbb{K}}\}$. A decomposition formula for
the scaling function coefficients $\{\alpha_{j-1,\mathbb{K}}\}$ can be obtained by multiplying both sides
of equation (5.87) by the scaling function $\phi_{j-1,\mathbb{P}}$ and integrating over $\Omega$. The
orthonormality properties in equations (5.83) and (5.84) enable the resulting equation to
be simplified to the form

$$\alpha_{j-1,\mathbb{P}} = \sum_{\mathbb{K}} \alpha_{j,\mathbb{K}} \int_\Omega \phi_{j,\mathbb{K}}(\xi,\eta)\, \phi_{j-1,\mathbb{P}}(\xi,\eta)\, d\Omega \tag{5.88}$$

The two-scale relationship given in equation (5.76) can be used to substitute for $\phi_{j-1,\mathbb{P}}$
in equation (5.88). Then, we obtain

$$\alpha_{j-1,\mathbb{P}} = \sum_{\mathbb{K}} \sum_{m=0}^{3} a_m\, \alpha_{j,\mathbb{K}} \int_\Omega \phi_{j,\mathbb{K}}(\xi,\eta)\, \phi_{j,m|\mathbb{P}}(\xi,\eta)\, d\Omega \tag{5.89}$$

Once again using the fact that the scaling functions are orthonormal, equation (5.89)
simplifies to

$$\alpha_{j-1,\mathbb{P}} = \sum_{\mathbb{K}} \sum_{m=0}^{3} a_m\, \alpha_{j,\mathbb{K}}\, \delta_{\mathbb{K},m|\mathbb{P}}$$

Finally, the following decomposition formula is obtained:

$$\alpha_{j-1,\mathbb{P}} = \sum_{m=0}^{3} a_m\, \alpha_{j,m|\mathbb{P}} \tag{5.90}$$

In a similar manner, decomposition formulas for the coarser-scale wavelet coefficients
can be derived by multiplying both sides of equation (5.87) by the wavelet $\psi^r_{j-1,\mathbb{P}}$ and
integrating over $\Omega$. The orthonormality properties and the two-scale relationships
given in equation (5.77) lead to the result

$$\beta^r_{j-1,\mathbb{P}} = \sum_{m=0}^{3} b^r_m\, \alpha_{j,m|\mathbb{P}}, \quad r = 1, 2, 3 \tag{5.91}$$

Note that equations (5.90) and (5.91) can be applied recursively. For example, if the
scaling function coefficients $\{\alpha_{j,\mathbb{K}}\}$ on some level $j$ are known, they can be used to
calculate the scaling function and wavelet coefficients on level $j - 1$. Then, applying
equations (5.90) and (5.91) again, the scaling function coefficients on level $j - 1$
can be used to calculate the scaling function and wavelet coefficients on level $j - 2$.
This process can be continued over however many levels there are in the multilevel
expansion of the function $f$ in equation (5.82). In this manner, the entire discrete
wavelet transform can be computed efficiently.

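Next, suppose that the reverse operation is needed. That is, given scaling
function and wavelet coefficients on some level $j - 1$, we seek to obtain the finer-resolution
scaling function coefficients on level $j$, $\{\alpha_{j,\mathbb{K}}\}$. A reconstruction formula can be
derived that will enable the calculation of these finer-scale coefficients. This formula is
obtained by multiplying both sides of equation (5.87) by the scaling function $\phi_{j,\mathbb{P}}$ and
integrating over $\Omega$. We obtain

$$\alpha_{j,\mathbb{P}} = \sum_{\mathbb{K}} \alpha_{j-1,\mathbb{K}} \int_\Omega \phi_{j-1,\mathbb{K}}\, \phi_{j,\mathbb{P}}\, d\Omega + \sum_{r=1}^{3} \sum_{\mathbb{K}} \beta^r_{j-1,\mathbb{K}} \int_\Omega \psi^r_{j-1,\mathbb{K}}\, \phi_{j,\mathbb{P}}\, d\Omega \tag{5.92}$$

Substituting for $\phi_{j-1,\mathbb{K}}$, $\psi^1_{j-1,\mathbb{K}}$, $\psi^2_{j-1,\mathbb{K}}$, and $\psi^3_{j-1,\mathbb{K}}$ using the two-scale equations,
equation (5.92) takes the form

$$\alpha_{j,\mathbb{P}} = \sum_{\mathbb{K}} \sum_{m=0}^{3} a_m\, \alpha_{j-1,\mathbb{K}} \int_\Omega \phi_{j,m|\mathbb{K}}\, \phi_{j,\mathbb{P}}\, d\Omega + \sum_{r=1}^{3} \sum_{\mathbb{K}} \sum_{m=0}^{3} b^r_m\, \beta^r_{j-1,\mathbb{K}} \int_\Omega \phi_{j,m|\mathbb{K}}\, \phi_{j,\mathbb{P}}\, d\Omega$$

Then, taking advantage of the orthonormality properties of the scaling functions, this
reduces to

$$\alpha_{j,\mathbb{P}} = \sum_{\mathbb{K}} \sum_{m=0}^{3} \left( a_m\, \alpha_{j-1,\mathbb{K}} + \sum_{r=1}^{3} b^r_m\, \beta^r_{j-1,\mathbb{K}} \right) \delta_{m|\mathbb{K},\mathbb{P}}$$

Finally, the following reconstruction formula is obtained:

$$\alpha_{j,m|\mathbb{P}} = a_m\, \alpha_{j-1,\mathbb{P}} + \sum_{r=1}^{3} b^r_m\, \beta^r_{j-1,\mathbb{P}} \tag{5.93}$$

Equation (5.93) can be used to calculate the scaling function coefficients on some
level $j$ from the coarser-resolution scaling function and wavelet coefficients on level
$j - 1$. Of course, it can be applied recursively to perform reconstruction over multiple
levels in calculating the inverse discrete wavelet transform.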
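
The decomposition and reconstruction formulas in equations (5.90), (5.91), and (5.93) can be sketched in a few lines. The following is a minimal illustration assuming NumPy, not a definitive implementation. The storage layout is an assumption made for the example: the $4^j$ coefficients on level $j$ are held in a flat array whose index is $m \cdot 4^{j-1} + P$, where $m$ is the leading digit of the multi-index $m|\mathbb{P}$ and $P$ is the flat index of $\mathbb{P}$. The function names are hypothetical.

```python
import numpy as np

S = 1.0 / np.sqrt(2.0)
# Row 0: scaling filter {a_m}; rows 1..3: wavelet filters {b^r_m}, r = 1, 2, 3.
FILTERS = np.array([
    [0.5, 0.5, 0.5, 0.5],
    [S, -S, 0.0, 0.0],
    [0.0, 0.0, S, -S],
    [0.5, 0.5, -0.5, -0.5],
])


def decompose(alpha_j):
    """One level of equations (5.90)-(5.91); alpha_j has length 4**j, j >= 1."""
    children = np.asarray(alpha_j, dtype=float).reshape(4, -1)  # row m: alpha_{j, m|P}
    coarse = FILTERS @ children
    return coarse[0], coarse[1:]  # alpha_{j-1, P} and beta^r_{j-1, P}


def reconstruct(alpha_coarse, beta_coarse):
    """One level of equation (5.93); returns the flat level-j coefficients."""
    stacked = np.vstack([alpha_coarse, beta_coarse])
    return (FILTERS.T @ stacked).reshape(-1)


def forward_transform(alpha_j):
    """Apply decompose repeatedly until a single level-0 coefficient remains."""
    current = np.asarray(alpha_j, dtype=float)
    details = []  # details[0] holds the finest-level wavelet coefficients
    while current.size > 1:
        current, beta = decompose(current)
        details.append(beta)
    return current, details


def inverse_transform(alpha_0, details):
    """Undo forward_transform by applying reconstruct from coarse to fine."""
    current = alpha_0
    for beta in reversed(details):
        current = reconstruct(current, beta)
    return current


# Round trip on arbitrary level-3 coefficients (4**3 = 64 values).
sample = np.random.default_rng(0).standard_normal(4 ** 3)
alpha_0, details = forward_transform(sample)
recovered = inverse_transform(alpha_0, details)
print(np.max(np.abs(recovered - sample)))
```

Because the $4 \times 4$ filter matrix is orthogonal, each level preserves the sum of squared coefficients, which gives a convenient sanity check on any implementation.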


CHAPTER  6 

WAVELET-BASED  IDENTIFICATION  OF  VOLTERRA  KERNELS 


The  first  five  chapters  have  presented  the  necessary  background  for  applying 
wavelets  to  the  identification  of  Volterra  kernels.  In  this  chapter,  two  wavelet-based 
kernel  identification  algorithms  are  described  in  detail.  The  first  implementation 
uses  the  triangular  wavelets  derived  in  Chapter  5 to  identify  the  triangular  form  of 
the second-order kernel. This algorithm also employs the Haar wavelet for the
representation of the first-order kernel. The second wavelet implementation uses the
piecewise-polynomial  multiwavelets  constructed  in  Chapter  4 to  identify  first  and 
second-order  kernels.  In  this  case,  the  symmetric  form  of  the  second-order  kernel  is 
identified.  Before  these  wavelet-based  algorithms  are  discussed,  a survey  of  kernel 
identification techniques is given in Section 6.1. Many approaches to kernel
identification, including the wavelet implementations discussed here, lead to a matrix
formulation  in  which  a least-squares  solution  must  be  obtained  for  a vector  of  kernel 
coefficients.  Because  kernel  identification  is  an  inverse  problem  that  is  frequently 
ill-posed,  regularization  techniques  are  often  needed  to  obtain  stable  solutions  to  the 
least-squares  problem.  This  is  discussed  in  Section  6.4  as  it  is  a critical  issue  in 
the  implementation  of  the  wavelet  algorithms.  Finally,  in  Section  6.5,  some  of  the 
finer  details  and  issues  concerning  the  wavelet  implementations  are  reviewed  for  the 
interested  reader. 


6.1  Kernel  Identification  Techniques 

In the analysis of nonlinear dynamical systems, the goal is to identify the kernels
that appear in the Volterra series representation of the system output. As discussed
earlier, these kernels characterize a given system since they comprise the operators
that map the input into the output. There are many approaches to the kernel
identification problem. When the system is described in terms of an
ODE, the kernels can be derived analytically. Another approach is to discretize the
Volterra series and solve a least-squares problem for the discrete kernel coefficients. In
some cases, specific inputs are given to the system in order to measure the Volterra
kernels in the time or frequency domain. Other methods are statistical in nature
and involve the use of Gaussian white noise inputs. In many approaches, including
the wavelet algorithms discussed later in this chapter, the kernels are expanded in
terms of a set of basis functions. The coefficients in the representation of the kernels
must then be determined. These standard approaches and others are reviewed in this
section.

A natural approach to kernel identification is to simply discretize the continuous-time
Volterra series and solve for the discrete kernel coefficients. The discrete-time
representation of the Volterra series for a stationary, causal, nonlinear dynamical
system is of the form [15]

$$y(k) = \sum_{n=1}^{\infty} \sum_{i_1=0}^{k} \cdots \sum_{i_n=0}^{k} h_n(i_1, \ldots, i_n)\, u(k - i_1) \cdots u(k - i_n) \tag{6.1}$$

In equation (6.1), it is assumed that the input and output sequences range from
$k = 0$ to $N$. Then, equation (6.1) yields $N + 1$ equations for the discrete outputs.
Furthermore, it is assumed that the initial conditions are zero so that there is no
zero-order term. Of course, in practice one cannot identify an infinite number of Volterra
kernels, and the series of operators is truncated at some integer $M$. As an example,
assume that the series has been truncated to include only the first and second-order
operators ($M = 2$). An output vector $\mathbf{y}$ and a vector of discrete kernel coefficients $\mathbf{h}$
can be defined as

$$\mathbf{y} = [\, y(0) \; \cdots \; y(N) \,]^T$$

$$\begin{aligned}
\mathbf{h} = [\, & h_1(0) \; h_1(1) \; \cdots \; h_1(N) \;\; h_2(0,0) \; h_2(0,1) \; \cdots \; h_2(0,N) \\
& h_2(1,0) \; \cdots \; h_2(1,N) \; \cdots \; h_2(N,0) \; \cdots \; h_2(N,N) \,]^T
\end{aligned}$$

Then, defining a matrix of input values

$$[U] = \left[ \begin{array}{*{19}{c}}
u(0) & 0 & 0 & \cdots & 0 & u^2(0) & 0 & 0 & \cdots & 0 & 0 & 0 & 0 & \cdots & 0 & \cdots & 0 & \cdots & 0 \\
u(1) & u(0) & 0 & \cdots & 0 & u^2(1) & u(1)u(0) & 0 & \cdots & 0 & u(0)u(1) & u^2(0) & 0 & \cdots & 0 & \cdots & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & \vdots & & \vdots & & \vdots & & \vdots \\
u(N) & u(N-1) & u(N-2) & \cdots & u(0) & u^2(N) & u(N)u(N-1) & u(N)u(N-2) & \cdots & u(N)u(0) & u(N-1)u(N) & u^2(N-1) & u(N-1)u(N-2) & \cdots & u(N-1)u(0) & \cdots & u(0)u(N) & \cdots & u^2(0)
\end{array} \right]$$

the $N + 1$ discrete output equations given by equation (6.1) can be written in matrix
form as

$$\mathbf{y} = [U]\, \mathbf{h} \tag{6.2}$$

In general, $[U]$ is an $(N + 1) \times \left( \sum_{n=1}^{M} (N + 1)^n \right)$ matrix, where $N + 1$ is the number of
discrete inputs and $M$ represents the highest-order operator included in the Volterra
series representation. Equation (6.2) can be solved for the values of the discrete kernel
coefficients in $\mathbf{h}$ using a least-squares method. Unfortunately, with this approach, the
higher-order nonlinear kernels are represented in terms of a prohibitively large number
of discrete coefficients. In general, if the input and output sequences are of length
$N + 1$, the discrete $n$th-order Volterra kernel is composed of $N_n = (N + 1)^n$ terms that
must be identified. This is a serious limitation of this simple discrete approach. The
number of coefficients in the discrete model can be reduced significantly by assuming
truncated forms of the Volterra kernels. Because most physical systems have finite
memory, meaning that a current input will have an influence on future outputs for
a finite time, the corresponding Volterra kernels can be expected to decay to zero
in a finite time. This is consistent with the concept of fading memory introduced in
Section 2.5. For example, for a linear system, the memory length represents the time
it takes for the impulse response function (or equivalently, the first-order kernel) to
decay to zero. With the assumption of finite memory, the first-order kernel can be
expressed in terms of $N_1 = f_s T_1 + 1$ unknown coefficients, where $f_s$ is the sampling
frequency of the input/output data and $T_1$ is the time at which the kernel has been
truncated. Similarly, truncating the second-order kernel at time $T_2$, the number of
discrete second-order kernel coefficients can be reduced to $N_2 = (f_s T_2 + 1)^2$. This can
result in a significant reduction in the number of coefficients that must be determined
in equation (6.2), especially if the input/output data sets are very large. Even so,
the number of coefficients is still typically very large. Therefore, the simple discrete
approach described above has limited utility in practice.
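
The truncated discrete approach described above can be illustrated with a small synthetic example. The following is a minimal sketch assuming NumPy. The function names, the memory lengths `n1` and `n2`, and the test kernels are hypothetical choices made for the example. The input matrix follows the column ordering of $[U]$, restricted to the first `n1` first-order and `n2 * n2` second-order coefficients.

```python
import numpy as np


def build_input_matrix(u, n1, n2):
    """Columns u(k-i), i < n1, then u(k-i1)u(k-i2), i1, i2 < n2 (zero before k = 0)."""
    n_samples = len(u)

    def delayed(i):
        col = np.zeros(n_samples)
        col[i:] = u[:n_samples - i]
        return col

    first = [delayed(i) for i in range(n1)]
    second = [delayed(i1) * delayed(i2) for i1 in range(n2) for i2 in range(n2)]
    return np.column_stack(first + second)


def simulate(u, h1, h2):
    """Direct evaluation of the truncated first- plus second-order series."""
    y = np.zeros(len(u))
    for k in range(len(u)):
        for i in range(len(h1)):
            if k - i >= 0:
                y[k] += h1[i] * u[k - i]
        for i1 in range(h2.shape[0]):
            for i2 in range(h2.shape[1]):
                if k - i1 >= 0 and k - i2 >= 0:
                    y[k] += h2[i1, i2] * u[k - i1] * u[k - i2]
    return y


# Synthetic test system with a four-sample memory (assumed values).
h1_true = np.array([1.0, 0.5, 0.25, 0.125])
h2_true = 0.1 * np.outer(h1_true, h1_true)  # symmetric second-order kernel

u = np.random.default_rng(0).standard_normal(200)
y = simulate(u, h1_true, h2_true)

U = build_input_matrix(u, n1=4, n2=4)
# Columns for (i1, i2) and (i2, i1) are identical, so U is rank deficient;
# lstsq returns the minimum-norm solution, which is the symmetric kernel.
h, *_ = np.linalg.lstsq(U, y, rcond=1e-10)
h1_est = h[:4]
h2_est = h[4:].reshape(4, 4)
print(np.max(np.abs(h1_est - h1_true)), np.max(np.abs(h2_est - h2_true)))
```

Even with a four-sample memory the model already has 20 unknowns, and the count grows as the square of the memory length for the second-order kernel, which illustrates why this direct approach scales poorly.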
Another approach to kernel identification is to use impulsive inputs to measure
the kernels of time-invariant systems [15], [16]. This is a generalization of the fact that
the first-order kernel for a linear system is equivalent to the impulse response of the
system. This can easily be demonstrated by evaluating the output of the first-order
Volterra operator for a unit impulse input $\delta_0$, defined as a function of infinitesimal
width and unit area centered at the origin. The first-order response to this input is
given by

$$y(t) = \int_0^t h_1(t - \xi)\, u(\xi)\, d\xi = \int_0^t h_1(t - \xi)\, \delta_0(\xi)\, d\xi = h_1(t) \tag{6.3}$$

where  the  sifting  property  of  the  impulse  function  has  been  employed.  Now  consider  a 
weakly  nonlinear  system  that  can  be  described  in  terms  of  the  first  and  second-order 
Volterra  kernels.  The  response  yQ  of  the  system  to  the  impulse  input  So  is  given  by 


y0(t)  = [ hi(t  - £)6Q(Z)d£ 

Jo 

+ [ I h2{t  -£,t  - v)6o(£)&o(v)d£dr) 

Jo  Jo 

= hi(t)  + h2{t,t)  (6.4) 

It  should  be  noted,  then,  that  the  response  y0  of  the  nonlinear  system  to  an  impulse 
input  is  not  equivalent  to  the  first-order  kernel  but  rather  it  is  given  as  the  sum  of  the 
first-order  kernel  and  the  diagonal  component  of  the  second-order  kernel.  The  com- 
ponents of  the  second-order  kernel,  as  well  as  the  first-order  kernel,  can  be  measured 
by  applying  double  impulses  of  the  form 
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u1(t)  = S0(t)  + 60(t-T)  (6.5) 

It is not difficult to show that the response $y_1$ to this input is given by

\[ y_1(t) = h_1(t) + h_1(t-T) + h_2(t,t) + 2h_2(t,\,t-T) + h_2(t-T,\,t-T) \tag{6.6} \]

where $h_2(t-T,\,t) = h_2(t,\,t-T)$ due to the symmetry of the second-order kernel. Note that, due to the time-invariance of the system, the response to the time-shifted impulse $\delta_0(t-T)$ is simply $y_0(t-T)$, given as

\[ y_0(t-T) = h_1(t-T) + h_2(t-T,\,t-T) \tag{6.7} \]

Then, using equations (6.4) and (6.7), equation (6.6) can be written as

\[ y_1(t) = y_0(t) + y_0(t-T) + 2h_2(t,\,t-T) \tag{6.8} \]

Then, the component $h_2(t,\,t-T)$ of the second-order kernel is given by

\[ h_2(t,\,t-T) = \tfrac{1}{2}\left( y_1(t) - y_0(t) - y_0(t-T) \right) \tag{6.9} \]
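
Equation (6.9) can be checked numerically on a small discrete-time analogue. The following is a minimal sketch, not the procedure used in the cited references: the kernels `h1` and `h2`, the memory length `M`, and the helper `quadratic_volterra` are hypothetical, and the continuous-time impulse is replaced by a unit sample.

```python
import numpy as np

def quadratic_volterra(u, h1, h2):
    """y[n] = sum_k h1[k] u[n-k] + sum_{k,m} h2[k,m] u[n-k] u[n-m] (causal, memory M)."""
    N, M = len(u), len(h1)
    y = np.zeros(N)
    for n in range(N):
        # Past inputs u[n], u[n-1], ..., u[n-M+1]; zero before the input starts.
        past = np.array([u[n - k] if n - k >= 0 else 0.0 for k in range(M)])
        y[n] = h1 @ past + past @ h2 @ past
    return y

M, N, T = 6, 16, 3                       # memory, record length, delay (all assumed)
rng = np.random.default_rng(0)
h1 = rng.standard_normal(M)              # hypothetical first-order kernel
A = rng.standard_normal((M, M))
h2 = 0.5 * (A + A.T)                     # hypothetical symmetric second-order kernel

d0 = np.zeros(N); d0[0] = 1.0            # unit sample at n = 0
dT = np.zeros(N); dT[T] = 1.0            # unit sample delayed by T

y0 = quadratic_volterra(d0, h1, h2)          # response to the single impulse
y0_shift = quadratic_volterra(dT, h1, h2)    # response to the delayed impulse
y1 = quadratic_volterra(d0 + dT, h1, h2)     # response to the double impulse, eq. (6.5)

# Discrete analogue of eq. (6.9): one off-diagonal line of the second-order kernel.
h2_line = 0.5 * (y1 - y0 - y0_shift)
```

In this sketch `h2_line[n]` reproduces `h2[n, n - T]` for `T <= n < M` and is zero elsewhere; repeating the experiment for other delays `T` fills in the remaining lines of the kernel.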

In the square domain of support of the second-order kernel, the component $h_2(t,\,t-T)$ can be interpreted as the value of the kernel along the line $\eta = \xi - T$. Equation (6.9) implies that, in theory, one can measure a given component of the second-order kernel by applying the inputs $\delta_0$ and $u_1$ to the system. The measured responses $y_0$ and $y_1$ can then be used to calculate the component $h_2(t,\,t-T)$. By varying the time delay $T$, other components of the second-order kernel can be measured. As noted by Silva [27], the first-order kernel can be extracted by applying the input $2\delta_0$. The response $y_2$ to this input is

\[ y_2(t) = 2h_1(t) + 4h_2(t,t) \tag{6.10} \]

Then, combining this result with equation (6.4), the first-order kernel can be obtained as

\[ h_1(t) = 2y_0(t) - \tfrac{1}{2}\,y_2(t) \tag{6.11} \]

It is straightforward to extend this approach to include the contributions of higher-order kernels. The difficulty with this technique is that, for many systems, it is impossible to apply an accurate impulse input. For example, there is no way to deliver an accurate impulse to a test section in a wind tunnel or to an aircraft during a flight test. This method has been applied in some cases, however. For instance, Tawfiq and Vinh [102] have applied this technique to study structural nonlinearities. In order to do this, they developed a special hammer that is capable of delivering multiple impulses over short time intervals. Silva [27, 28] has successfully measured first and second-order Volterra kernels for nonlinear aeroelastic systems by applying discrete-time unit impulses to CFD models of these systems. In related work, Raveh [29, 30] has investigated the use of step inputs in CFD models to identify step-type kernels for nonlinear aerodynamic systems.

One of the most common approaches to kernel identification is the harmonic probing technique, in which the kernels are measured in the frequency domain through the use of harmonic inputs. Consider, once again, a weakly nonlinear system that is modeled in terms of the first and second-order Volterra operators. The response of this system to a harmonic input $e^{j\Omega t}$ is given by

\[
\begin{aligned}
y(t) &= \int_0^{\infty} h_1(\xi)\,e^{j\Omega(t-\xi)}\,d\xi + \int_0^{\infty}\!\!\int_0^{\infty} h_2(\xi,\eta)\,e^{j\Omega(t-\xi)}\,e^{j\Omega(t-\eta)}\,d\xi\,d\eta \\
&= \left[ \int_0^{\infty} h_1(\xi)\,e^{-j\Omega\xi}\,d\xi \right] e^{j\Omega t} + \left[ \int_0^{\infty}\!\!\int_0^{\infty} h_2(\xi,\eta)\,e^{-j\Omega\xi}\,e^{-j\Omega\eta}\,d\xi\,d\eta \right] e^{j2\Omega t}
\end{aligned}
\tag{6.12}
\]

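Recall from Chapter 2 that the frequency response functions of the first and second-order kernels, also known as Volterra kernel transforms, are defined as

\[ H_1(j\omega) = \int_{-\infty}^{\infty} h_1(\xi)\,e^{-j\omega\xi}\,d\xi \tag{6.13} \]

\[ H_2(j\omega_1, j\omega_2) = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} h_2(\xi,\eta)\,e^{-j(\omega_1\xi + \omega_2\eta)}\,d\xi\,d\eta \tag{6.14} \]

Then, equation (6.12) can be written as

\[ y(t) = H_1(j\Omega)\,e^{j\Omega t} + H_2(j\Omega, j\Omega)\,e^{j2\Omega t} \tag{6.15} \]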

In this manner, the first-order kernel transform $H_1$ can be determined over any range of frequencies by varying the input frequency. At the same time, only diagonal components of $H_2$ in the $(\omega_1, \omega_2)$ frequency domain can be measured. In order to determine $H_2$ at off-diagonal points, a dual-tone harmonic signal of the form

\[ u(t) = e^{j\Omega_1 t} + e^{j\Omega_2 t} \tag{6.16} \]

is employed. The output corresponding to this input is given by

\[
\begin{aligned}
y(t) ={}& H_1(j\Omega_1)\,e^{j\Omega_1 t} + H_1(j\Omega_2)\,e^{j\Omega_2 t} + H_2(j\Omega_1, j\Omega_1)\,e^{j2\Omega_1 t} \\
&+ 2H_2(j\Omega_1, j\Omega_2)\,e^{j(\Omega_1+\Omega_2)t} + H_2(j\Omega_2, j\Omega_2)\,e^{j2\Omega_2 t}
\end{aligned}
\tag{6.17}
\]

Then, it is possible to determine the value of the second-order frequency response $H_2$ at the point $(j\Omega_1, j\Omega_2)$ as half the magnitude of the response at the frequency $\Omega_1 + \Omega_2$. It is not difficult to generalize these results to include the measurement of higher-order kernels. The harmonic probing, or multitone method, was first introduced by Bedrosian and Rice [103]. Worden et al. [104] extended the technique to
measure  the  cross-kernels  appearing  in  the  multi-input  Volterra  series.  A drawback 
of  harmonic  probing  is  that  measuring  the  kernels  using  dual  tone  signals  can  be  very 
time  consuming  as  many  tests  are  required  to  extract  the  kernels.  To  address  this 
concern,  Boyd  and  Chua  [59]  employed  input  signals  composed  of  several  harmonics 
for  the  purposes  of  measuring  the  kernels  at  multiple  points  for  each  input.  The 
input  signals  were  designed  such  that  as  few  output  frequencies  as  possible  would 
correspond  to  the  input  frequencies.  An  output  frequency  that  matches  the  input 
frequency is useless for measuring higher-order kernels since there is always a contribution from the first-order kernel at that frequency. Evans et al. [105] and Weiss
et  al.  [106]  further  improved  on  this  method,  designing  input  signals  that  maximize 
the  number  of  distinct  points  in  the  kernel  that  can  be  measured  with  one  input. 
In  addition,  they  sought  to  achieve  a nearly  even  spacing  between  measured  points 
while minimizing the largest frequency contribution. One limitation of these multitone methods is that they require specific inputs that may not be feasible in a noisy
experimental  environment. 
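
The dual-tone relation in equation (6.17) can be illustrated with a discrete-time sketch. Everything below is an assumption made for illustration: the kernels `h1` and `h2`, the tone frequencies `W1` and `W2` (in radians per sample), and the use of a least-squares fit to separate the five output frequencies. The discrete kernel transforms `H1` and `H2` are the sampled counterparts of equations (6.13) and (6.14).

```python
import numpy as np

M = 5                                    # kernel memory in samples (assumed)
rng = np.random.default_rng(1)
h1 = rng.standard_normal(M)              # hypothetical first-order kernel
A = rng.standard_normal((M, M))
h2 = 0.5 * (A + A.T)                     # hypothetical symmetric second-order kernel
lags = np.arange(M)

def H1(w):
    """Discrete first-order kernel transform."""
    return np.sum(h1 * np.exp(-1j * w * lags))

def H2(w1, w2):
    """Discrete second-order kernel transform."""
    return np.exp(-1j * w1 * lags) @ h2 @ np.exp(-1j * w2 * lags)

W1, W2 = 0.3, 0.7                        # tone frequencies (assumed)
n = np.arange(200)

def u(t):
    """Dual-tone input of eq. (6.16), defined for all t so the response is steady state."""
    return np.exp(1j * W1 * t) + np.exp(1j * W2 * t)

past = u(n[:, None] - lags[None, :])     # past[i, k] = u(n_i - k)
y = past @ h1 + np.einsum('ik,km,im->i', past, h2, past)

# Fit the output with the five frequencies that appear in eq. (6.17).
freqs = np.array([W1, W2, 2 * W1, W1 + W2, 2 * W2])
E = np.exp(1j * np.outer(n, freqs))
c, *_ = np.linalg.lstsq(E, y, rcond=None)

H2_measured = c[3] / 2.0                 # half the component at W1 + W2
```

Here `H2_measured` agrees with `H2(W1, W2)`, and `c[0]` and `c[2]` return `H1(W1)` and the diagonal value `H2(W1, W1)`, as in the single-tone case of equation (6.15).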

Another  classical  kernel  identification  technique  is  the  cross-correlation  method 
developed  by  Lee  and  Schetzen  [58].  In  this  technique,  the  kernels  associated  with  the 
Wiener  series  are  estimated  instead  of  Volterra  kernels.  The  Wiener  series  corresponds 
to  an  orthogonalization  of  the  Volterra  series  via  the  Gram-Schmidt  procedure  under 
the  assumption  of  a Gaussian  white  noise  input  [13].  The  advantage  of  the  Wiener 
series  is  that  the  terms  in  the  series  are  orthogonal  and  can  be  estimated  one  at  a 
time. In contrast, the Volterra series is not orthogonal and the Volterra kernels must be identified simultaneously. However, a major drawback of the Wiener series is that,
unlike  the  Volterra  series,  the  kernels  are  input-dependent.  The  first  few  terms  in  the 
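Wiener series are given as [58]

\[ G_0[k_0, u(t)] = k_0 \tag{6.18} \]

\[ G_1[k_1, u(t)] = \int_0^{\infty} k_1(\tau)\,u(t-\tau)\,d\tau \tag{6.19} \]

\[ G_2[k_2, u(t)] = \int_0^{\infty}\!\!\int_0^{\infty} k_2(\tau_1,\tau_2)\,u(t-\tau_1)\,u(t-\tau_2)\,d\tau_1\,d\tau_2 - \sigma_u^2 \int_0^{\infty} k_2(\tau,\tau)\,d\tau \tag{6.20} \]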

where $\sigma_u^2$ denotes the variance of the input. The Wiener operators of second order and higher are directly dependent on the input variance. The cross-correlation method is usually applied as a discrete-time procedure. First, as noted in Ref. [96], the zero-order Wiener kernel is a constant equal to the output mean $\mu_y$:

\[ k_0 = \mu_y = \frac{1}{N}\sum_{t=1}^{N} y(t) \tag{6.21} \]

where $N$ is the length of the data record. The first-order Wiener kernel can be shown to be proportional to the first-order cross-correlation $\phi_{uy}$ between the input and output. In practice, however, the first-order kernel is defined in terms of the first-order cross-correlation between the input and the zero-order residual, defined as [96]

\[ v_0(t) = y(t) - k_0 \tag{6.22} \]
Then, the first-order kernel can be obtained as

\[ k_1(\tau) = \frac{1}{\sigma_u^2}\,\phi_{uv_0}(\tau) = \frac{1}{N\sigma_u^2}\sum_{t=1+\tau}^{N} u(t-\tau)\,v_0(t) \tag{6.23} \]

Similarly, the second-order Wiener kernel is proportional to the second-order cross-correlation $\phi_{uuv_1}$ between the input and the first-order residual [96]:

\[ k_2(\tau_1,\tau_2) = \frac{1}{2\sigma_u^4}\,\phi_{uuv_1}(\tau_1,\tau_2) = \frac{1}{2N\sigma_u^4}\sum_{t=1+\max(\tau_1,\tau_2)}^{N} u(t-\tau_1)\,u(t-\tau_2)\,v_1(t) \tag{6.24} \]

where the first-order residual is given by

\[ v_1(t) = v_0(t) - \sum_{\tau=0}^{N-1} k_1(\tau)\,u(t-\tau) \tag{6.25} \]

In general, the $n$th-order kernel is proportional to the $n$th-order cross-correlation between the input $u$ and the $(n-1)$th-order residual $v_{n-1}$ [96]. As noted by several
authors,  a disadvantage  of  the  cross-correlation  method  is  that  it  typically  requires 
a very  large  data  set  in  order  to  obtain  accurate  kernel  estimates.  Furthermore, 
the  technique  requires  a Gaussian  white  noise  input,  which  is  not  always  possible  in 
practice. 
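
The zero- and first-order steps of the cross-correlation procedure can be sketched as follows. This is an illustrative example only: the test system is a hypothetical linear filter `h` with a constant offset, for which the first-order Wiener kernel coincides with `h`, and the record length `N` is deliberately large because the estimates converge slowly.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, sigma = 200_000, 8, 1.5            # record length, memory, input std (assumed)
h = 0.8 ** np.arange(M)                  # hypothetical first-order kernel
u = sigma * rng.standard_normal(N)       # Gaussian white noise input
y = 2.0 + np.convolve(u, h)[:N]          # linear system plus a constant offset

k0 = y.mean()                            # zero-order kernel: the output mean
v0 = y - k0                              # zero-order residual
var_u = u.var()                          # estimate of the input variance

# First-order kernel: cross-correlation of input and zero-order residual,
# scaled by the input variance (0-based indexing, so the sum starts at t = tau).
k1 = np.array([u[:N - tau] @ v0[tau:] for tau in range(M)]) / (N * var_u)
```

With this record length the estimate `k1` matches `h` to within a few thousandths; shortening the record makes the scatter visibly larger, which is the data-length drawback noted above.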

The impulse input and harmonic probing techniques are examples of direct approaches to kernel identification in which the kernels are measured in terms of the system response to specific inputs. The cross-correlation method is a statistical method that also requires a specific type of input. In contrast, with indirect methods such as the simple discrete model described earlier, the kernels are identified from input/output data from the system. Many indirect approaches entail expanding the
kernels  in  terms  of  a set  of  basis  functions.  Then,  the  kernel  identification  problem 
becomes  one  of  solving  for  the  expansion  coefficients  that  represent  the  kernels.  A 
popular  choice  of  basis  is  the  Laguerre  polynomials,  which  were  first  suggested  by 
Wiener  [13]  for  the  representation  of  Wiener  kernels.  They  were  later  employed  by 
Marmarelis  [60]  in  the  identification  of  kernels  for  biological  systems.  The  discrete 
Laguerre functions take the form [60]

\[ b_j(\tau) = \alpha^{(\tau-j)/2}\,(1-\alpha)^{1/2} \sum_{k=0}^{j} (-1)^k \binom{\tau}{k}\binom{j}{k}\,\alpha^{j-k}\,(1-\alpha)^k, \qquad \tau \ge 0 \tag{6.26} \]

where $j$ is the order of the Laguerre function. The parameter $\alpha$, $0 < \alpha < 1$, determines the exponential rate of decay of these functions. As noted by Marmarelis [60], the choice of $\alpha$ is critical to the effectiveness of the Laguerre approach. As a basic rule, as $\alpha$ increases, the Laguerre functions become more dilated and are better suited for approximating kernels with longer memory. Similar to this approach, Reisenthel [34]
used  decaying  exponentials  to  approximate  the  first  and  second-order  kernels  for 
nonlinear  aeroelastic  systems.  Korenberg  and  Hunter  [107]  have  noted  that  using 
smooth  basis  functions  such  as  the  Laguerre  polynomials  almost  always  results  in 
smooth kernel estimates. Therefore, they may not be effective in approximating kernels with sharp features. Nikolaou and Mantha [61] applied biorthogonal wavelets to
compress  the  first  and  second-order  kernels  that  model  a particular  chemical  reaction. 
They  used  a priori  knowledge  about  the  plant  to  determine  which  wavelet  coefficients 
to  retain  in  the  model.  Kurdila  et  al.  [62]  also  used  biorthogonal  wavelets  for  the 
approximation  of  Volterra  kernels  for  an  experimental  aeroelastic  system  designed 
to  study  limit  cycle  oscillation  in  aircraft.  The  wavelet-based  kernel  identification 
algorithms  described  later  in  this  chapter  clearly  fall  into  the  same  category  as  these 
methods.  The  advantage  of  expanding  the  kernels  in  terms  of  a basis  set  is  that  there 
is  no  specific  input  excitation  required.  Therefore,  these  methods  can  typically  use 
physically  realizable  inputs.  A drawback  of  these  methods  is  that  they  often  lead  to 
ill-posed  problems,  as  discussed  in  Section  6.4. 
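
As an illustration of the basis-expansion idea, the discrete Laguerre functions of equation (6.26) can be evaluated directly and checked for orthonormality. This is a minimal sketch; the function name `laguerre`, the value of `alpha`, and the truncation length are assumptions chosen for the example.

```python
import numpy as np
from math import comb

def laguerre(j, alpha, length):
    """Discrete Laguerre function of order j at tau = 0, ..., length - 1, per eq. (6.26)."""
    b = np.empty(length)
    for tau in range(length):
        # math.comb(tau, k) returns 0 when k > tau, as the formula requires.
        s = sum((-1) ** k * comb(tau, k) * comb(j, k)
                * alpha ** (j - k) * (1 - alpha) ** k
                for k in range(j + 1))
        b[tau] = alpha ** ((tau - j) / 2) * np.sqrt(1 - alpha) * s
    return b

alpha, length, order = 0.5, 400, 5       # decay parameter, samples, number of functions
B = np.array([laguerre(j, alpha, length) for j in range(order)])
gram = B @ B.T                           # inner products between the functions
```

The Gram matrix `gram` is the identity to within round-off, so a kernel expanded in this basis has coefficients given by simple inner products. Increasing `alpha` slows the decay of every function, which is why larger values suit kernels with longer memory.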

There are numerous other approaches to kernel identification. Sometimes, the system is assumed to have a specific block structure. For example, the Hammerstein model assumes that the system is composed of a static nonlinearity followed by a linear dynamical system. Conversely, the Wiener model assumes that the structure of the system corresponds to a linear dynamical system followed by a static nonlinearity. Korenberg [108] developed a parallel cascade algorithm whereby the system is modeled by a finite number of Wiener structures connected in parallel. Korenberg [109] also contributed a fast orthogonal identification method that is frequently used in biomedical engineering applications. Neural networks provide yet another tool for the estimation of Volterra kernels [110], [63]. Wray and Green [63] have demonstrated that many artificial neural networks have equivalent finite Volterra series representations. As a final example, adaptive Volterra filters are widely used in electrical engineering applications. The discrete filter coefficients are often computed recursively using the least mean squares (LMS) approach or the recursive least-squares (RLS) method. The reader can refer to the texts by Rugh [15] and Schetzen [16] for more discussion of kernel identification techniques. In addition, Korenberg and Hunter [107] have written a survey paper on kernel identification, and Westwick and Kearney [96] have conducted an extensive review of nonparametric system identification.


6.2  Kernel  Identification  Using  Triangular  Wavelets

In this section, the first of two wavelet-based kernel identification algorithms is described in detail. In both cases, it is assumed that we have a single-input/single-output system with weak nonlinearities. More specifically, it is assumed that the output can be expressed in terms of the first and second-order Volterra operators as

\[ y(t) = y_1(t) + y_2(t) \tag{6.27} \]

Furthermore, the discussion is limited to time-invariant, causal systems with one-sided inputs (that is, the input starts at time $t = 0$). Recall that, under these conditions, the outputs of the first and second-order Volterra operators take the form

\[ y_1(t) = \int_0^t h_1(\xi)\,u(t-\xi)\,d\xi \tag{6.28} \]

\[ y_2(t) = \int_0^t\!\!\int_0^t h_2(\xi,\eta)\,u(t-\xi)\,u(t-\eta)\,d\xi\,d\eta \tag{6.29} \]


In this section, the triangular wavelets derived in Chapter 5 are used to approximate the triangular form of the second-order kernel. Using the triangular form of the kernel, the output of the second-order operator can be written as

\[ y_2(t) = \int_0^t\!\!\int_0^{\eta} h_{2,\mathrm{tri}}(\xi,\eta)\,u(t-\xi)\,u(t-\eta)\,d\xi\,d\eta \tag{6.30} \]

where $h_{2,\mathrm{tri}}$ denotes the triangular second-order kernel. A wavelet basis must also be chosen for the representation of the first-order kernel. The Haar wavelet is a reasonable choice since, like the triangular wavelets, it is a piecewise-constant function. In the next section, the piecewise-polynomial multiwavelets derived in Chapter 4 are used to approximate the first-order kernel and the symmetric form of the second-order kernel. Both wavelet-based kernel identification algorithms result in a linear least-squares problem that must be solved for the wavelet coefficients that represent the kernels.

In order to derive a discrete form of the first-order output in equation (6.28), the input is first discretized by employing a zero-order hold, which is a piecewise-constant approximation of the input. In a zero-order hold approximation, each time the input is sampled, it is held constant until the next sample is taken. Recall from Chapter 3 that the Haar scaling function, also known as the characteristic function of the interval $[0,1]$, is defined as

\[ \chi(\xi) = \begin{cases} 1 & 0 \le \xi < 1 \\ 0 & \text{otherwise} \end{cases} \tag{6.31} \]


Then, a zero-order hold representation of the input can be written in terms of the Haar scaling function as

\[ u_j(\xi) = \sum_{k=0}^{N-1} u_{j,k}\,\chi_{j,k}(\xi) \tag{6.32} \]

where we have the usual notation

\[ \chi_{j,k}(\xi) = 2^{j/2}\,\chi(2^j\xi - k) \tag{6.33} \]

and $N$ denotes the number of samples taken. The dilation index $j$ defines the sampling step of the input to be $\Delta t = 2^{-j}$. The coefficients $\{u_{j,k}\}$ in equation (6.32) are scaled samples of the input:

\[ u_{j,k} = 2^{-j/2}\,u(2^{-j}k), \qquad k = 0, \ldots, N-1 \tag{6.34} \]

The output is also discretized with a sampling step of $2^{-j}$. Given $N$ samples of the output, equation (6.28) yields $N$ equations for the $N$ discrete first-order outputs:

\[ y_{1,j}(t_n) = y_{1,j}(2^{-j}n) = \int_0^{t_n} h_1(\xi)\,u_j(t_n - \xi)\,d\xi, \qquad n = 1, \ldots, N \tag{6.35} \]
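
The scaling in equations (6.32) through (6.34) can be made concrete with a short sketch. The function names and the test signal below are hypothetical; the point is only that the factor $2^{-j/2}$ in the coefficients cancels the factor $2^{j/2}$ in the scaling function, so the expansion returns the held sample.

```python
import numpy as np

def zoh_coefficients(u, j, N):
    """Scaled samples u_{j,k} = 2^(-j/2) u(2^(-j) k), k = 0..N-1, as in eq. (6.34)."""
    k = np.arange(N)
    return 2.0 ** (-j / 2) * u(2.0 ** (-j) * k)

def zoh_eval(coeffs, j, xi):
    """Evaluate u_j(xi) = sum_k u_{j,k} chi_{j,k}(xi) for 0 <= xi < N * 2^(-j)."""
    k = np.floor(2.0 ** j * np.asarray(xi)).astype(int)   # interval containing xi
    return 2.0 ** (j / 2) * coeffs[k]                      # chi_{j,k} equals 2^(j/2) there

j, N = 3, 16                                # resolution level and sample count (assumed)
u = lambda t: np.sin(2 * np.pi * t)         # hypothetical input signal
c = zoh_coefficients(u, j, N)
xi = np.array([0.0, 0.1, 0.125, 0.3])
u_hold = zoh_eval(c, j, xi)                 # held values u(0), u(0), u(0.125), u(0.25)
```

Each evaluation point returns the most recent sample at or before it, which is the zero-order hold behaviour described above.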

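Note that the subscript $j$ on the input and output denotes an approximation of these functions on resolution level $j$. From equation (6.32), the input $u_j(t_n - \xi)$ can be written as

\[ u_j(t_n - \xi) = u_j(2^{-j}n - \xi) = \sum_{k=0}^{n-1} u_{j,k}\,\chi_{j,k}(t_n - \xi) \tag{6.36} \]

Note that, from the definition of the characteristic function $\chi$ and equation (6.33),

\[
\begin{aligned}
\chi_{j,k}(t_n - \xi)
&= \begin{cases} 2^{j/2} & 2^{-j}k \le 2^{-j}n - \xi < 2^{-j}(k+1) \\ 0 & \text{otherwise} \end{cases} \\
&= \begin{cases} 2^{j/2} & 2^{-j}(k-n) \le -\xi < 2^{-j}(k-n+1) \\ 0 & \text{otherwise} \end{cases} \\
&= \begin{cases} 2^{j/2} & 2^{-j}(n-k-1) < \xi \le 2^{-j}(n-k) \\ 0 & \text{otherwise} \end{cases} \\
&= \chi_{j,n-k-1}(\xi)
\end{aligned}
\tag{6.37}
\]

Therefore, we have

\[ u_j(t_n - \xi) = \sum_{k=0}^{n-1} u_{j,k}\,\chi_{j,n-k-1}(\xi) \tag{6.38} \]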


The next step is to approximate the first-order kernel in terms of the Haar wavelet basis. A single-scale expansion of the kernel using Haar scaling functions takes the form

\[ h_{1,j}(\xi) = \sum_{p=0}^{N-1} \alpha_{j,p}\,\chi_{j,p}(\xi) \tag{6.39} \]

where $h_{1,j}$ denotes an approximation of the kernel on level $j$ and $\{\alpha_{j,p}\}$ are scaling function coefficients. This single-scale representation is simply a piecewise-constant approximation of the first-order kernel. Strictly speaking, the first-order kernel is supported over the entire domain $[0, t_N]$, where $t_N = 2^{-j}N$ represents the time duration of the data set. Therefore, the summation in equation (6.39) extends from $p = 0$ to $N - 1$, where $N$ is the total number of input/output samples. In practice, however, many physical systems possess finite memory in that the present input influences future outputs for a finite amount of time. This implies that the Volterra kernels of the system decay to zero in a finite amount of time. Assuming that the first-order kernel decays to zero in $T_1$ seconds, the upper summation limit in equation (6.39) can be truncated to $N_1 - 1$, where $N_1 = 2^{j}T_1$. The assumption of a finite kernel greatly reduces the number of unknown coefficients that must be identified, especially when the input/output data set is large. For example, consider a case where 16 seconds of data have been sampled at resolution level $j = 8$, or a sampling rate of 256 Hz. Then, there would be a total of 4096 samples of the input and output and the representation of a 16-second first-order kernel would require 4096 coefficients in equation (6.39). If, however, the kernel has a memory of only 2 seconds, the number of unknown coefficients can be reduced to 512. Of course, one does not usually know the memory length of the kernel a priori and an assumption must be made regarding the kernel size. More will be said about this later, but for now it is assumed that the first-order kernel has a memory length of $T_1$ seconds.

Substituting the truncated kernel in equation (6.39) and the expression for the input in equation (6.38) into equation (6.35), we obtain

\[ y_{1,j}(t_n) = \sum_{p=0}^{N_1-1}\sum_{k=0}^{n-1} u_{j,k}\,\alpha_{j,p} \int_0^{T_1} \chi_{j,n-k-1}(\xi)\,\chi_{j,p}(\xi)\,d\xi \tag{6.40} \]

The upper limit of integration has been lowered to $T_1$ since the kernel has been truncated. By the orthonormality of the Haar scaling functions, we have

\[ \int_0^{T_1} \chi_{j,n-k-1}(\xi)\,\chi_{j,p}(\xi)\,d\xi = \delta_{n-k-1,p} = \delta_{n-p-1,k} \tag{6.41} \]

Then, equation (6.40) reduces to

\[ y_{1,j}(t_n) = \sum_{p=0}^{N_1-1} u_{j,n-p-1}\,\alpha_{j,p} \tag{6.42} \]

To be more precise, the upper limit in this summation is $n - 1$ for $n < N_1$ and $N_1 - 1$ for $n \ge N_1$. Equation (6.42) provides $N$ equations to be solved for the $N_1$ coefficients $\{\alpha_{j,p}\}_{p=0}^{N_1-1}$ that represent the first-order kernel:

\[
\begin{aligned}
y_{1,j}(t_1) &= u_{j,0}\,\alpha_{j,0} \\
y_{1,j}(t_2) &= u_{j,1}\,\alpha_{j,0} + u_{j,0}\,\alpha_{j,1} \\
&\;\;\vdots \\
y_{1,j}(t_N) &= u_{j,N-1}\,\alpha_{j,0} + u_{j,N-2}\,\alpha_{j,1} + \cdots + u_{j,N-N_1}\,\alpha_{j,N_1-1}
\end{aligned}
\tag{6.43}
\]
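Defining the vectors

\[
\underline{y}_{1,j} := \begin{bmatrix} y_{1,j}(t_1) \\ \vdots \\ y_{1,j}(t_N) \end{bmatrix},
\qquad
\underline{\alpha}_j := \begin{bmatrix} \alpha_{j,0} \\ \vdots \\ \alpha_{j,N_1-1} \end{bmatrix}
\]

and the $N \times N_1$ input matrix

\[
[U_1] = \begin{bmatrix}
u_{j,0} & 0 & 0 & \cdots & 0 \\
u_{j,1} & u_{j,0} & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
u_{j,N_1-1} & u_{j,N_1-2} & u_{j,N_1-3} & \cdots & u_{j,0} \\
u_{j,N_1} & u_{j,N_1-1} & u_{j,N_1-2} & \cdots & u_{j,1} \\
\vdots & \vdots & \vdots & & \vdots \\
u_{j,N-1} & u_{j,N-2} & u_{j,N-3} & \cdots & u_{j,N-N_1}
\end{bmatrix}
\]

the $N$ expressions in equation (6.43) can be written in matrix form as

\[ \underline{y}_{1,j} = [U_1]\,\underline{\alpha}_j \tag{6.44} \]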
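
The structure of the input matrix in equation (6.44) can be sketched in a few lines. This is an illustrative example with assumed sizes and a hypothetical kernel; the helper `first_order_input_matrix` is not taken from the source. Row `n` (0-based) holds the samples $u_{j,n}, u_{j,n-1}, \ldots$, so the product with the coefficient vector is a truncated discrete convolution.

```python
import numpy as np

def first_order_input_matrix(u, N1):
    """N x N1 lower-triangular Toeplitz matrix with entry (n, p) = u[n - p], zero for n < p."""
    N = len(u)
    U1 = np.zeros((N, N1))
    for p in range(N1):
        U1[p:, p] = u[:N - p]            # column p is the input delayed by p samples
    return U1

rng = np.random.default_rng(3)
N, N1 = 64, 8                            # samples and kernel coefficients (assumed)
u = rng.standard_normal(N)               # scaled input samples u_{j,k}
alpha = 0.7 ** np.arange(N1)             # hypothetical single-scale kernel coefficients

U1 = first_order_input_matrix(u, N1)
y1 = U1 @ alpha                          # discrete first-order output

# With noise-free data the coefficients are recovered by least squares.
alpha_hat, *_ = np.linalg.lstsq(U1, y1, rcond=None)
```

Because `N` is much larger than `N1`, the system is overdetermined, which is why the identification is posed as a least-squares problem rather than a direct solve.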

Equation (6.44) does not take advantage of the multiscale structure afforded by the wavelet basis. Recall that a multilevel expansion of the first-order kernel can be written as

\[ h_{1,j}(\xi) = \sum_{p} \alpha_{j_0,p}\,\chi_{j_0,p}(\xi) + \sum_{l=j_0}^{j-1}\sum_{p} \beta_{l,p}\,\psi_{l,p}(\xi) \tag{6.45} \]

This multiscale representation is in terms of Haar wavelets on levels $j_0$ through $j - 1$ and the scaling functions on the coarsest level $j_0$ only. Recall that this representation is equivalent to the single-scale expansion of the kernel given in equation (6.39). For convenience, a vector of multiscale coefficients is defined as

\[
\underline{\beta}_1 = \begin{bmatrix} \underline{\alpha}_{j_0} \\ \underline{\beta}_{j_0} \\ \underline{\beta}_{j_0+1} \\ \vdots \\ \underline{\beta}_{j-1} \end{bmatrix}
\]

where $\underline{\alpha}_{j_0}$ denotes all scaling function coefficients on level $j_0$, $\underline{\beta}_{j_0}$ represents all wavelet coefficients on level $j_0$, and so forth. In Chapter 3, it was shown that the multilevel coefficients in $\underline{\beta}_1$ can be computed from the single-scale coefficients via a recursive, fast-filter operation. It was also demonstrated that this decomposition can be written in matrix form such that

\[ \underline{\beta}_1 = [T_1]\,\underline{\alpha}_j \tag{6.46} \]

where $[T_1]$ is the matrix realization of the wavelet transform operation. The reconstruction of $\underline{\alpha}_j$ from $\underline{\beta}_1$ is then simply given by

\[ \underline{\alpha}_j = [T_1]^{-1}\,\underline{\beta}_1 \tag{6.47} \]

Substituting into equation (6.44), we have an expression for the discrete first-order output in terms of the multiscale kernel coefficients:

\[ \underline{y}_{1,j} = [U_1][T_1]^{-1}\,\underline{\beta}_1 \tag{6.48} \]

In many cases, the wavelet coefficients that correspond to some of the higher-resolution levels contribute very little to the structure of the kernel and can be neglected. Then, the $N_1$-dimensional multiscale vector $\underline{\beta}_1$ can be truncated, reducing the number of wavelet coefficients that must be determined to $\tilde{N}_1 < N_1$. Then, equation (6.48) reduces to

\[ \underline{y}_{1,j} = [U_1][\tilde{T}_1]^{-1}\,\underline{\tilde{\beta}}_1 \tag{6.49} \]

where $\underline{\tilde{\beta}}_1$ is the truncated vector of multiscale coefficients and $[\tilde{T}_1]^{-1}$ represents the first $\tilde{N}_1$ columns of the reconstruction matrix $[T_1]^{-1}$.
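
The truncation step can be illustrated with an orthonormal Haar transform matrix. The sketch below is an assumption-laden example rather than the construction of Chapter 3: it builds the full decomposition down to a single scaling coefficient (that is, it takes the coarsest level to be level 0), orders the rows from coarse to fine, and uses a hypothetical kernel that is constant over blocks of four samples.

```python
import numpy as np

def haar_matrix(n):
    """Orthonormal Haar analysis matrix (n a power of two), rows ordered coarse to fine."""
    if n == 1:
        return np.ones((1, 1))
    H = haar_matrix(n // 2)
    top = np.kron(H, [1.0, 1.0])                    # coarser-level rows, stretched by two
    bottom = np.kron(np.eye(n // 2), [1.0, -1.0])   # finest-level wavelets
    return np.vstack([top, bottom]) / np.sqrt(2.0)

N1 = 16
T1 = haar_matrix(N1)                                # decomposition matrix
T1_inv = T1.T                                       # orthonormal, so inverse = transpose

alpha = np.repeat([1.0, -0.5, 0.25, 2.0], 4)        # hypothetical single-scale coefficients
beta = T1 @ alpha                                   # multiscale coefficients

keep = 4                                            # retained coefficients (assumed)
alpha_rec = T1_inv[:, :keep] @ beta[:keep]          # first `keep` columns of the inverse
```

For this kernel the two finest wavelet levels vanish, so four multiscale coefficients reproduce all sixteen single-scale values exactly. For a general kernel the discarded coefficients are small rather than zero, and the truncation trades a small approximation error for fewer unknowns.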

In deriving a discrete form of the second-order Volterra operator, the same discretization is used for the input and output. Therefore, equation (6.30) yields $N$ equations for the discrete second-order outputs:

\[ y_{2,j}(t_n) = \int_0^{t_n}\!\!\int_0^{\eta} h_{2,\mathrm{tri}}(\xi,\eta)\,u_j(t_n-\xi)\,u_j(t_n-\eta)\,d\xi\,d\eta, \qquad n = 1, \ldots, N \tag{6.50} \]

where $t_n = 2^{-j}n$. The zero-order hold approximation of the input takes the same form as before:

\[ u_j(t_n - \xi) = \sum_{k=0}^{n-1} u_{j,k}\,\chi_{j,n-k-1}(\xi) = \sum_{k=0}^{n-1} u_{j,n-k-1}\,\chi_{j,k}(\xi) \tag{6.51} \]

\[ u_j(t_n - \eta) = \sum_{m=0}^{n-1} u_{j,m}\,\chi_{j,n-m-1}(\eta) = \sum_{m=0}^{n-1} u_{j,n-m-1}\,\chi_{j,m}(\eta) \tag{6.52} \]

where

\[ u_{j,k} = 2^{-j/2}\,u(2^{-j}k) \tag{6.53} \]

\[ u_{j,m} = 2^{-j/2}\,u(2^{-j}m) \tag{6.54} \]

Figure 6.1: Triangular Domain of Support for the Second-Order Volterra Kernel

Defining the two-dimensional characteristic function

\[ \chi_{j,(k,m)}(\xi,\eta) := \chi_{j,k}(\xi)\,\chi_{j,m}(\eta) \tag{6.55} \]

we have

\[ u_j(t_n - \xi)\,u_j(t_n - \eta) = \sum_{k=0}^{n-1}\sum_{m=0}^{n-1} u_{j,n-k-1}\,u_{j,n-m-1}\,\chi_{j,(k,m)}(\xi,\eta) \tag{6.56} \]

Substituting into equation (6.50), we obtain

\[ y_{2,j}(t_n) = \sum_{k=0}^{n-1}\sum_{m=0}^{n-1} u_{j,n-k-1}\,u_{j,n-m-1} \int_0^{t_n}\!\!\int_0^{\eta} h_{2,\mathrm{tri}}(\xi,\eta)\,\chi_{j,(k,m)}(\xi,\eta)\,d\xi\,d\eta \tag{6.57} \]

Next, the triangular second-order kernel $h_{2,\mathrm{tri}}$ is approximated in terms of the triangular wavelets derived in Chapter 5. The second-order kernel is discretized on level $j_2$ which, due to computational considerations, is typically a coarser resolution level than level $j$. A single-scale expansion of the triangular kernel on level $j_2$ can be written as

\[ h_{2,\mathrm{tri},j_2}(\xi,\eta) = \sum_{P} \alpha_{j_2,P}\,\phi_{j_2,P}(\xi,\eta) \tag{6.58} \]

where $\{\phi_{j_2,P}\}$ are triangular scaling functions. Recall that the notation $P$ represents a $j_2$-dimensional multi-index $p_1 \ldots p_{j_2}$ where $p_i \in \{0, 1, 2, 3\}$, $i = 1, \ldots, j_2$. The total number of coefficients $\{\alpha_{j_2,P}\}$ in equation (6.58) is $N_2 = 4^{j_2}$. Once again, it is assumed that the system has finite memory and the second-order kernel can be truncated at some time $T_2 < t_N$. Then, the kernel is supported over the domain $\Omega$ shown in Figure (6.2). Substituting the single-scale approximation of the triangular kernel into equation (6.57), we obtain

\[ y_{2,j}(t_n) = \sum_{k=0}^{n-1}\sum_{m=0}^{n-1}\sum_{P} u_{j,n-k-1}\,u_{j,n-m-1}\,\alpha_{j_2,P} \int_{\Omega} \chi_{j,(k,m)}(\xi,\eta)\,\phi_{j_2,P}(\xi,\eta)\,d\xi\,d\eta \tag{6.59} \]

The truncation of the second-order kernel at time $T_2$ effectively lowers the upper summation limits on $k$ and $m$ to $\bar{N} - 1$, where $\bar{N} = 2^{j}T_2$. To be precise, the upper limits of these summations are $n - 1$ for $n < \bar{N}$ and $\bar{N} - 1$ for $n \ge \bar{N}$. Furthermore, the summations over $k$ and $m$ involve two-dimensional characteristic functions over the $[0, T_2] \times [0, T_2]$ square domain. When $m < k$, the function $\chi_{j,(k,m)}$ is not in the domain $\Omega$. Hence, the integral in equation (6.59) is always zero when $m < k$ and the summation limits on $m$ can be changed to $m = k$ to $\bar{N} - 1$:

\[ y_{2,j}(t_n) = \sum_{k=0}^{\bar{N}-1}\sum_{m=k}^{\bar{N}-1}\sum_{P} u_{j,n-k-1}\,u_{j,n-m-1}\,\alpha_{j_2,P} \int_{\Omega} \chi_{j,(k,m)}(\xi,\eta)\,\phi_{j_2,P}(\xi,\eta)\,d\xi\,d\eta \tag{6.60} \]

These $N$ equations for the discrete second-order outputs can be written in matrix form as

\[ \underline{y}_{2,j} = [U_2][P]\,\underline{\alpha}_{\mathrm{tri},j_2} \tag{6.61} \]

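where $\underline{y}_{2,j}$ is a vector of discrete second-order outputs and $\underline{\alpha}_{\mathrm{tri},j_2}$ is a vector of single-scale kernel coefficients. The quadratic input matrix $[U_2]$ takes the form

\[
[U_2] = \begin{bmatrix}
u_{j,0}^2 & 0 & 0 & 0 & 0 & 0 & 0 & \cdots & 0 \\
u_{j,1}^2 & u_{j,0}u_{j,1} & u_{j,0}^2 & 0 & 0 & 0 & 0 & \cdots & 0 \\
u_{j,2}^2 & u_{j,1}u_{j,2} & u_{j,1}^2 & u_{j,0}u_{j,2} & u_{j,0}u_{j,1} & u_{j,0}^2 & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & & & & & & \vdots \\
u_{j,\bar{N}-1}^2 & u_{j,\bar{N}-2}u_{j,\bar{N}-1} & u_{j,\bar{N}-2}^2 & \cdots & & & & \cdots & u_{j,0}^2 \\
u_{j,\bar{N}}^2 & u_{j,\bar{N}-1}u_{j,\bar{N}} & u_{j,\bar{N}-1}^2 & \cdots & & & & \cdots & u_{j,1}^2 \\
\vdots & \vdots & \vdots & & & & & & \vdots \\
u_{j,N-1}^2 & u_{j,N-2}u_{j,N-1} & u_{j,N-2}^2 & \cdots & & & & \cdots & u_{j,N-\bar{N}}^2
\end{bmatrix}
\]

Clearly, $[U_2]$ has $N$ rows. Each column in $[U_2]$ corresponds to a different characteristic function in $\Omega$. There are $\frac{1}{2}(\bar{N}^2 + \bar{N})$ such functions in $\Omega$, so there are $\frac{1}{2}(\bar{N}^2 + \bar{N})$ columns in $[U_2]$. The entries of the matrix $[P]$ correspond to the values of the integrals

\[ \int_{\Omega} \chi_{j,(k,m)}(\xi,\eta)\,\phi_{j_2,P}(\xi,\eta)\,d\xi\,d\eta \tag{6.62} \]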
Each column in $[P]$ corresponds to a different triangular scaling function $\phi_{j_2,P}$ while each row corresponds to a different characteristic function $\chi_{j,(k,m)}$. Therefore, $[P]$ is a $\frac{1}{2}(\bar{N}^2 + \bar{N}) \times N_2$ matrix. Because each scaling function and characteristic function is supported over a relatively small subregion of $\Omega$, most of the entries in $[P]$ are zero. Therefore, in practice, it is not efficient to calculate $[P]$ explicitly. Instead, the entries of the $N \times N_2$ matrix product $[U_2][P]$ are calculated directly using equation (6.60).

The formulation in equation (6.61) does not exploit the multiscale structure of the discrete wavelet transform of $h_{2,\mathrm{tri}}$. Recall that the single-scale approximation $h_{2,\mathrm{tri},j_2}$ can be written in an equivalent, multilevel form

\[ h_{2,\mathrm{tri},j_2}(\xi,\eta) = \sum_{l=0}^{j_2-1}\sum_{P}\sum_{r=1}^{3} \beta^{r}_{l,P}\,\psi^{r}_{l,P}(\xi,\eta) + \alpha\,\phi(\xi,\eta) \tag{6.63} \]

This representation is in terms of the wavelets on levels 0 through $j_2 - 1$ and the scaling function on level 0. The summation over $r$ reflects the fact that there are three different types of triangular wavelets. By construction, the coarsest-resolution level is level 0, where the scaling function $\phi$ is simply a constant function over $\Omega$. As always, the multilevel wavelet coefficients in equation (6.63) can be calculated from the single-scale coefficients in a fast, efficient manner. Defining a multiscale vector of coefficients

\[
\underline{\beta}_2 = \begin{bmatrix} \alpha \\ \underline{\beta}_0 \\ \underline{\beta}_1 \\ \vdots \\ \underline{\beta}_{j_2-2} \\ \underline{\beta}_{j_2-1} \end{bmatrix}
\]

where each $\underline{\beta}_l$, $l = 0, \ldots, j_2 - 1$, represents all the wavelet coefficients on level $l$, we can write

\[ \underline{\beta}_2 = [T_2]\,\underline{\alpha}_{\mathrm{tri},j_2} \tag{6.64} \]

In equation (6.64), $[T_2]$ represents a decomposition matrix that performs the wavelet transform. Then, we have

\[ \underline{\alpha}_{\mathrm{tri},j_2} = [T_2]^{-1}\,\underline{\beta}_2 \tag{6.65} \]

where $[T_2]^{-1}$ is the reconstruction matrix. Substituting for $\underline{\alpha}_{\mathrm{tri},j_2}$ in equation (6.61), we obtain

\[ \underline{y}_{2,j} = [U_2][P][T_2]^{-1}\,\underline{\beta}_2 \tag{6.66} \]

Just as for the first-order kernel, in many cases the wavelet coefficients in $\underline{\beta}_2$ corresponding to some of the finer-resolution levels do not contribute much to the overall structure of the kernel. Then, $\underline{\beta}_2$ can be truncated to $\tilde{N}_2$ nonzero coefficients and equation (6.66) becomes

\[ \underline{y}_{2,j} = [U_2][P][\tilde{T}_2]^{-1}\,\underline{\tilde{\beta}}_2 \tag{6.67} \]

where $\underline{\tilde{\beta}}_2$ is the truncated vector of wavelet coefficients and $[\tilde{T}_2]^{-1}$ represents the first $\tilde{N}_2$ columns of $[T_2]^{-1}$.

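Given the models for the discrete outputs of the first and second-order Volterra operators:

\[ \underline{y}_{1,j} = [U_1][\tilde{T}_1]^{-1}\,\underline{\tilde{\beta}}_1 = [A_1]\,\underline{\tilde{\beta}}_1 \tag{6.68} \]

\[ \underline{y}_{2,j} = [U_2][P][\tilde{T}_2]^{-1}\,\underline{\tilde{\beta}}_2 = [A_2]\,\underline{\tilde{\beta}}_2 \tag{6.69} \]

the total output can be written as

\[ \underline{y}_j = \underline{y}_{1,j} + \underline{y}_{2,j} \tag{6.70} \]

where $\underline{y}_j$ is a vector of sampled outputs. Then, the total model takes the form

\[ \underline{y}_j = \begin{bmatrix} A_1 & A_2 \end{bmatrix} \begin{bmatrix} \underline{\tilde{\beta}}_1 \\ \underline{\tilde{\beta}}_2 \end{bmatrix} = [A]\,\underline{\tilde{\beta}} \tag{6.71} \]

Thus, the kernel identification procedure reduces to a least-squares problem in which equation (6.71) must be solved for the wavelet coefficients in $\underline{\tilde{\beta}}$ that represent the first and second-order Volterra kernels. It should be emphasized that the first and second-order kernels must be identified simultaneously because the Volterra operators are not orthogonal.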
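
The stacked least-squares problem of equation (6.71) can be sketched for a discrete quadratic system. This is a simplified illustration under stated assumptions: the kernels are represented directly by piecewise-constant (single-scale) coefficients at the input resolution, so the matrices $[P]$ and $[T]^{-1}$ are omitted, and all sizes, names, and test kernels are hypothetical. The column ordering of `U2` follows the pattern of the quadratic input matrix shown earlier.

```python
import numpy as np

def first_order_matrix(u, M1):
    """Entry (n, p) = u[n - p], zero for n < p."""
    N = len(u)
    U1 = np.zeros((N, M1))
    for p in range(M1):
        U1[p:, p] = u[:N - p]
    return U1

def second_order_matrix(u, M2):
    """Columns indexed by pairs (k, m), k <= m < M2; entry (n, (k, m)) = u[n-k] u[n-m]."""
    lagged = first_order_matrix(u, M2)               # lagged[n, k] = u[n - k]
    pairs = [(k, m) for m in range(M2) for k in range(m + 1)]
    U2 = np.column_stack([lagged[:, k] * lagged[:, m] for k, m in pairs])
    return U2, pairs

rng = np.random.default_rng(4)
N, M1, M2 = 400, 6, 4                                # record length and memories (assumed)
u = rng.standard_normal(N)
U1 = first_order_matrix(u, M1)
U2, pairs = second_order_matrix(u, M2)

a1_true = 0.6 ** np.arange(M1)                       # hypothetical first-order coefficients
a2_true = rng.standard_normal(len(pairs))            # hypothetical triangular-kernel coefficients
y = U1 @ a1_true + U2 @ a2_true                      # total output: sum of both operators

A = np.hstack([U1, U2])                              # stacked regressor, as in eq. (6.71)
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
a1_hat, a2_hat = coef[:M1], coef[M1:]                # both kernels identified together
```

The second-order block has `M2 * (M2 + 1) / 2` columns, matching the count of characteristic functions in the triangular domain. Both sets of coefficients are obtained from a single solve, in keeping with the requirement that the kernels be identified simultaneously.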


6.3  Multiwavelet  Kernel  Identification 

In this section, a kernel identification algorithm is developed in which the first and second-order kernels are approximated in terms of the piecewise-polynomial multiwavelets derived in Chapter 4. Once again, the output of the first-order Volterra operator takes the form

\[ y_1(t) = \int_0^t h_1(\xi)\,u(t-\xi)\,d\xi \tag{6.72} \]

In this case, however, the symmetric form of the second-order kernel, $h_{2,\mathrm{sym}}$, is used instead of the triangular form. Then, the second-order Volterra operator can be written as

\[ y_2(t) = \int_0^t\!\!\int_0^t h_{2,\mathrm{sym}}(\xi,\eta)\,u(t-\xi)\,u(t-\eta)\,d\xi\,d\eta \tag{6.73} \]

where the symmetric second-order kernel $h_{2,\mathrm{sym}}$ is defined over a square domain. It is demonstrated later that the symmetric kernel can be represented using two-dimensional tensor product multiwavelets that have been adapted to the square domain.

First, the input and output are discretized in the same manner as in the last
section. A zero-order hold representation of the input is written as

$$u_j(\xi) = \sum_{k=0}^{N-1} u_{j,k}\,\chi_{j,k}(\xi) \tag{6.74}$$

where the coefficients $\{u_{j,k}\}$ are given by

$$u_{j,k} = 2^{-j/2}\,u(2^{-j}k), \quad k = 0, \ldots, N-1 \tag{6.75}$$

and $N$ is the total number of samples. As before, we have $N$ equations for the discrete
first-order outputs:

$$y_{1,j}(t_n) = \int_0^{t_n} h_1(\xi)\,u_j(t_n - \xi)\,d\xi, \quad n = 1, \ldots, N \tag{6.76}$$

where $t_n = 2^{-j}n$. In the last section, the following expression was derived for the input
$u_j(t_n - \xi)$:

$$u_j(t_n - \xi) = \sum_{k=0}^{n-1} u_{j,k}\,\chi_{j,n-k-1}(\xi) = \sum_{k=0}^{n-1} u_{j,n-k-1}\,\chi_{j,k}(\xi) \tag{6.77}$$

Next, the first-order kernel $h_1$ is approximated in terms of the piecewise-polynomial
multiwavelets. A single-scale representation of the kernel on resolution level $j_1$ is
given by

$$h_{1,j_1}(\xi) = \sum_p \sum_{s=1}^{r} \alpha^s_{j_1,p}\,\phi^s_{j_1,p}(\xi) \tag{6.78}$$

where $r$ denotes the number of generators. Recall, from Chapter 4, that $r$ depends on
the order of the multiwavelets. For example, there are four piecewise-linear scaling
functions, four piecewise-quadratic scaling functions, and five piecewise-cubic scaling
functions. For computational purposes, the kernel is often discretized on a coarser
resolution level than the input and output. Once again, it is assumed that the system
has finite memory and the first-order kernel can be truncated at time $T_1$. The kernel
is then supported over the domain $[0, T_1]$ and the scaling functions in equation (6.78)
are adapted to this domain in the manner detailed in Section 4.6. This implies that
$p$ is summed from 0 to $2^{j_1}T_1 - 1$, with an additional scaling function $\phi_{j_1,2^{j_1}T_1}$
restricted to $[0, T_1]$. The scaling function whose support extends to the left of $\xi = 0$
is also restricted to $[0, T_1]$. Hence there are a total of $N_1 = 2^{j_1}T_1 r + 1$ scaling
functions in the single-scale expansion of the first-order kernel.

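Substituting equations (6.77) and (6.78) into equation (6.76), we obtain

$$y_{1,j}(t_n) = \sum_{k=0}^{\bar{N}-1} \sum_p \sum_{s=1}^{r} u_{j,n-k-1}\,\alpha^s_{j_1,p} \int_0^{T_1} \phi^s_{j_1,p}(\xi)\,\chi_{j,k}(\xi)\,d\xi \tag{6.79}$$

Note that, because the kernel has been truncated, the sum over $k$ has an
upper limit of $\bar{N} - 1$, where $\bar{N} = 2^j T_1$, and the upper integration limit is reduced to
$T_1$. More precisely, the upper limit in the $k$ summation is $n - 1$ for $n < \bar{N}$ and $\bar{N} - 1$
for $n \geq \bar{N}$. Equation (6.79) can be written in matrix form as

$$\underline{y}_{1,j} = [U_1][P_1]\underline{\alpha}_{j_1} \tag{6.80}$$

where $\underline{y}_{1,j}$ is a vector of discrete first-order outputs and $\underline{\alpha}_{j_1}$ is a vector of scaling
function coefficients that represent the first-order kernel. The $N \times \bar{N}$ input matrix
$[U_1]$ takes the form

$$[U_1] = \begin{bmatrix}
u_{j,0} & 0 & 0 & \cdots & 0 \\
u_{j,1} & u_{j,0} & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & & \vdots \\
u_{j,\bar{N}-1} & u_{j,\bar{N}-2} & u_{j,\bar{N}-3} & \cdots & u_{j,0} \\
u_{j,\bar{N}} & u_{j,\bar{N}-1} & u_{j,\bar{N}-2} & \cdots & u_{j,1} \\
\vdots & \vdots & \vdots & & \vdots \\
u_{j,N-1} & u_{j,N-2} & u_{j,N-3} & \cdots & u_{j,N-\bar{N}}
\end{bmatrix}$$

and $[P_1]$ is an $\bar{N} \times N_1$ matrix whose entries take the form

$$\int_0^{T_1} \phi^s_{j_1,p}(\xi)\,\chi_{j,k}(\xi)\,d\xi$$

Each row of $[P_1]$ corresponds to a different characteristic function $\chi_{j,k}$ while each
column corresponds to a different scaling function $\phi^s_{j_1,p}$. It should be noted that the
matrix $[P_1]$ is never explicitly formed in practice; rather, the entries of the $N \times N_1$
matrix product $[U_1][P_1]$ are calculated using equation (6.79).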
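
The lower-triangular, banded structure of the first-order input matrix can be made concrete with a short sketch. It assumes the zero-order hold coefficients of equation (6.75) and the entry pattern $u_{j,n-k-1}$ from equation (6.79); the function names and the argument `n_mem` (the number of input samples spanned by the truncated kernel) are hypothetical and are not taken from the MATLAB implementation.

```python
import numpy as np

def zoh_coefficients(u, j, n_samples):
    """u_{j,k} = 2^(-j/2) * u(2^(-j) k) for k = 0..N-1, as in equation (6.75)."""
    k = np.arange(n_samples)
    return 2.0 ** (-j / 2) * u(2.0 ** (-j) * k)

def first_order_input_matrix(uj, n_mem):
    """Return the N x n_mem matrix whose (n, k) entry is u_{j,n-k-1}.

    Rows correspond to n = 1..N and columns to k = 0..n_mem-1.
    Entries with n-k-1 < 0 are left at zero.
    """
    N = len(uj)
    U1 = np.zeros((N, n_mem))
    for row in range(N):                       # row index is n - 1
        for k in range(min(row + 1, n_mem)):
            U1[row, k] = uj[row - k]           # n - k - 1 = row - k
    return U1

uj = zoh_coefficients(np.cos, j=3, n_samples=6)
U1 = first_order_input_matrix(uj, n_mem=3)
```

With `N = 6` and `n_mem = 3`, the last row holds the three most recent input coefficients, and the first two rows contain the leading zeros that reflect the missing input history.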

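At this point, we take advantage of the multiscale structure of the discrete wavelet
transform of the first-order kernel. The wavelet transform yields the following
multilevel representation of the kernel:

$$h_{1,j_1}(\xi) = \sum_p \sum_{s=1}^{r} \alpha^s_{j_0,p}\,\phi^s_{j_0,p}(\xi) + \sum_{l=j_0}^{j_1-1} \sum_p \sum_{s=1}^{r} \beta^s_{l,p}\,\psi^s_{l,p}(\xi)$$

where $j_0$ is the coarsest level used in the representation. Defining a vector of multiscale
coefficients

$$\underline{\beta}_1 := \{\underline{\alpha}_{j_0},\; \underline{\beta}_{j_0},\; \underline{\beta}_{j_0+1},\; \ldots,\; \underline{\beta}_{j_1-1}\}^T$$

we can write

$$\underline{\beta}_1 = [T_1]\underline{\alpha}_{j_1} \tag{6.82}$$

where $[T_1]$ is the decomposition matrix that performs the multiwavelet transform.
Then,

$$\underline{\alpha}_{j_1} = [T_1]^{-1}\underline{\beta}_1 \tag{6.83}$$

and equation (6.80) takes the form

$$\underline{y}_{1,j} = [U_1][P_1][T_1]^{-1}\underline{\beta}_1 \tag{6.84}$$

As before, it can often be assumed that the wavelet coefficients corresponding to some
of the finer-resolution levels in $\underline{\beta}_1$ can be neglected. Then, $\underline{\beta}_1$ can be truncated to
$\tilde{N}_1 < N_1$ nonzero terms. This results in the reduced-order model

$$\underline{y}_{1,j} = [U_1][P_1][\tilde{T}_1]^{-1}\underline{\tilde{\beta}}_1 \tag{6.85}$$

where $\underline{\tilde{\beta}}_1$ is the truncated vector of wavelet coefficients and $[\tilde{T}_1]^{-1}$ represents the first
$\tilde{N}_1$ columns of $[T_1]^{-1}$.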

Using the same discretization of the input and output, the discrete second-order
outputs can be written as

$$y_{2,j}(t_n) = \int_0^{t_n} \int_0^{t_n} h_{2,sym}(\xi, \eta)\,u_j(t_n - \xi)\,u_j(t_n - \eta)\,d\xi\,d\eta, \quad n = 1, \ldots, N \tag{6.86}$$

In the last section, it was shown that the zero-order hold approximation of the input
yields the expression

$$u_j(t_n - \xi)\,u_j(t_n - \eta) = \sum_{k=0}^{n-1} \sum_{m=0}^{n-1} u_{j,n-k-1}\,u_{j,n-m-1}\,\chi_{j,(k,m)}(\xi, \eta) \tag{6.87}$$

where

$$u_{j,n-k-1} = 2^{-j/2}\,u(2^{-j}(n - k - 1)) \tag{6.88}$$

$$u_{j,n-m-1} = 2^{-j/2}\,u(2^{-j}(n - m - 1)) \tag{6.89}$$

Next, the second-order kernel is expanded in terms of the two-dimensional tensor
product multiwavelets. Tensor product multiwavelets were discussed in detail in
Section 3.6. A single-scale approximation of the second-order kernel on resolution
level $j_2$ takes the form

$$h_{2,j_2}(\xi, \eta) = \sum_p \sum_q \sum_{s=1}^{r} \sum_{v=1}^{r} \alpha^{(s,v)}_{j_2,(p,q)}\,\phi^s_{j_2,p}(\xi)\,\phi^v_{j_2,q}(\eta) = \sum_p \sum_q \sum_{s=1}^{r} \sum_{v=1}^{r} \alpha^{(s,v)}_{j_2,(p,q)}\,\Phi^{(s,v)}_{j_2,(p,q)}(\xi, \eta) \tag{6.90}$$

Once again, it is assumed that the system has finite memory and the kernel can be
truncated at some time $T_2$. The two-dimensional scaling functions in equation (6.90)
are adapted to the $[0, T_2] \times [0, T_2]$ square domain in the manner detailed in Section
4.6. Then, there are a total of $N_2 = (2^{j_2}T_2 r + 1)^2$ scaling functions in the single-scale
expansion of the second-order kernel.

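Substituting equations (6.87) and (6.90) into equation (6.86), we obtain the
following expression for the discrete second-order outputs:

$$y_{2,j}(t_n) = \sum_{k,m=0}^{\bar{N}-1} \sum_{p,q} \sum_{s,v=1}^{r} u_{j,n-k-1}\,u_{j,n-m-1}\,\alpha^{(s,v)}_{j_2,(p,q)} \int_0^{T_2} \int_0^{T_2} \Phi^{(s,v)}_{j_2,(p,q)}(\xi, \eta)\,\chi_{j,(k,m)}(\xi, \eta)\,d\xi\,d\eta \tag{6.91}$$

Because the second-order kernel has been truncated at time $T_2$, the upper limits in
the $k$ and $m$ summations have been lowered to $\bar{N} - 1$, where $\bar{N} = 2^j T_2$, and the upper
integration limits have been changed to $T_2$. More specifically, the upper summation
limits for $k$ and $m$ are $n - 1$ for $n < \bar{N}$ and $\bar{N} - 1$ for $n \geq \bar{N}$. Equation (6.91) can
be written in matrix form:

$$\underline{y}_{2,j} = [U_2][P_2]\underline{\alpha}_{j_2} \tag{6.92}$$

where $\underline{y}_{2,j}$ is a vector of discrete second-order outputs and $\underline{\alpha}_{j_2}$ is a vector of
two-dimensional scaling function coefficients. The quadratic input matrix $[U_2]$ takes the
form

$$[U_2] = \begin{bmatrix}
u_{j,0}^2 & 0 & \cdots & 0 & 0 & 0 & \cdots & 0 & \cdots & 0 & \cdots & 0 \\
u_{j,1}^2 & u_{j,1}u_{j,0} & \cdots & 0 & u_{j,0}u_{j,1} & u_{j,0}^2 & \cdots & 0 & \cdots & 0 & \cdots & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots & & \vdots & & \vdots \\
u_{j,\bar{N}-1}^2 & u_{j,\bar{N}-1}u_{j,\bar{N}-2} & \cdots & u_{j,\bar{N}-1}u_{j,0} & u_{j,\bar{N}-2}u_{j,\bar{N}-1} & u_{j,\bar{N}-2}^2 & \cdots & u_{j,\bar{N}-2}u_{j,0} & \cdots & u_{j,0}u_{j,\bar{N}-1} & \cdots & u_{j,0}^2 \\
u_{j,\bar{N}}^2 & u_{j,\bar{N}}u_{j,\bar{N}-1} & \cdots & u_{j,\bar{N}}u_{j,1} & u_{j,\bar{N}-1}u_{j,\bar{N}} & u_{j,\bar{N}-1}^2 & \cdots & u_{j,\bar{N}-1}u_{j,1} & \cdots & u_{j,1}u_{j,\bar{N}} & \cdots & u_{j,1}^2 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots & & \vdots & & \vdots \\
u_{j,N-1}^2 & u_{j,N-1}u_{j,N-2} & \cdots & u_{j,N-1}u_{j,N-\bar{N}} & u_{j,N-2}u_{j,N-1} & u_{j,N-2}^2 & \cdots & u_{j,N-2}u_{j,N-\bar{N}} & \cdots & u_{j,N-\bar{N}}u_{j,N-1} & \cdots & u_{j,N-\bar{N}}^2
\end{bmatrix}$$

There are a total of $N$ rows in $[U_2]$. Each column of $[U_2]$ corresponds to a different
two-dimensional characteristic function $\chi_{j,(k,m)}$ in the $[0, T_2] \times [0, T_2]$ domain. There
are a total of $\bar{N}^2$ such functions; therefore, $[U_2]$ has $\bar{N}^2$ columns. The matrix $[P_2]$ is
composed of integrals of the form

$$\int_0^{T_2} \int_0^{T_2} \Phi^{(s,v)}_{j_2,(p,q)}(\xi, \eta)\,\chi_{j,(k,m)}(\xi, \eta)\,d\xi\,d\eta \tag{6.93}$$

where each row corresponds to a different characteristic function $\chi_{j,(k,m)}$ and each
column corresponds to a different two-dimensional scaling function $\Phi^{(s,v)}_{j_2,(p,q)}$. Therefore,
$[P_2]$ is an $\bar{N}^2 \times N_2$ matrix.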
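
A sketch of how the quadratic input matrix could be assembled follows. It assumes the entry pattern $u_{j,n-k-1}u_{j,n-m-1}$ from equation (6.91) with the column index ordered with $k$ as the outer index and $m$ as the inner index; the function name and the argument `n_mem` (the number of input samples spanned by the truncated kernel) are hypothetical. Under that ordering, each row is the Kronecker product of the corresponding row of lagged input samples with itself.

```python
import numpy as np

def second_order_input_matrix(uj, n_mem):
    """Return the N x n_mem^2 matrix with entries u_{j,n-k-1} * u_{j,n-m-1}.

    Column index is k * n_mem + m. Entries whose input index would be
    negative are zero, matching the leading zeros of the matrix.
    """
    uj = np.asarray(uj, dtype=float)
    N = len(uj)
    U2 = np.zeros((N, n_mem ** 2))
    for row in range(N):                      # row index is n - 1
        lag = np.zeros(n_mem)
        lim = min(row + 1, n_mem)
        lag[:lim] = uj[row::-1][:lim]         # u_{n-1}, u_{n-2}, ...
        U2[row] = np.kron(lag, lag)           # all pairwise products
    return U2

U2 = second_order_input_matrix([1.0, 2.0, 3.0, 4.0], n_mem=2)
```

The number of columns grows with the square of the kernel memory, which is one reason the second-order kernel is usually discretized on a coarser level than the input.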

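As a final step, recall that the single-scale expansion of the second-order kernel
can be written equivalently as a multiscale representation:

$$\begin{aligned}
h_{2,j_2}(\xi, \eta) &= \sum_{p,q} \sum_{s=1}^{r} \sum_{v=1}^{r} \alpha^{(s,v)}_{j_0,(p,q)}\,\phi^s_{j_0,p}(\xi)\,\phi^v_{j_0,q}(\eta) + \sum_{l=j_0}^{j_2-1} \sum_{p,q} \sum_{s=1}^{r} \sum_{v=1}^{r} \beta^{1,(s,v)}_{l,(p,q)}\,\phi^s_{l,p}(\xi)\,\psi^v_{l,q}(\eta) \\
&\quad + \sum_{l=j_0}^{j_2-1} \sum_{p,q} \sum_{s=1}^{r} \sum_{v=1}^{r} \beta^{2,(s,v)}_{l,(p,q)}\,\psi^s_{l,p}(\xi)\,\phi^v_{l,q}(\eta) + \sum_{l=j_0}^{j_2-1} \sum_{p,q} \sum_{s=1}^{r} \sum_{v=1}^{r} \beta^{3,(s,v)}_{l,(p,q)}\,\psi^s_{l,p}(\xi)\,\psi^v_{l,q}(\eta) \\
&= \sum_{p,q} \sum_{s=1}^{r} \sum_{v=1}^{r} \alpha^{(s,v)}_{j_0,(p,q)}\,\Phi^{(s,v)}_{j_0,(p,q)}(\xi, \eta) + \sum_{l=j_0}^{j_2-1} \sum_{p,q} \sum_{s=1}^{r} \sum_{v=1}^{r} \beta^{1,(s,v)}_{l,(p,q)}\,\Psi^{1,(s,v)}_{l,(p,q)}(\xi, \eta) \\
&\quad + \sum_{l=j_0}^{j_2-1} \sum_{p,q} \sum_{s=1}^{r} \sum_{v=1}^{r} \beta^{2,(s,v)}_{l,(p,q)}\,\Psi^{2,(s,v)}_{l,(p,q)}(\xi, \eta) + \sum_{l=j_0}^{j_2-1} \sum_{p,q} \sum_{s=1}^{r} \sum_{v=1}^{r} \beta^{3,(s,v)}_{l,(p,q)}\,\Psi^{3,(s,v)}_{l,(p,q)}(\xi, \eta)
\end{aligned} \tag{6.94}$$

which is in terms of the three different types of tensor product wavelets $\{\Psi^1, \Psi^2, \Psi^3\}$
on levels $j_0$ through $j_2 - 1$ and the scaling functions on the coarsest level $j_0$. In Section
3.6, it was demonstrated that there are fast algorithms whereby the coefficients in this
multiscale expansion can be calculated from the single-scale coefficients. Defining a
vector of multiscale coefficients

$$\underline{\beta}_2 := \{\underline{\alpha}_{j_0},\; \underline{\beta}_{j_0},\; \underline{\beta}_{j_0+1},\; \ldots,\; \underline{\beta}_{j_2-1}\}^T$$

in which all the coefficients corresponding to a particular level have been grouped
together in vector form, we can write

$$\underline{\beta}_2 = [T_2]\underline{\alpha}_{j_2} \tag{6.95}$$

where $[T_2]$ is a decomposition matrix that performs the two-dimensional wavelet
transform. Similarly,

$$\underline{\alpha}_{j_2} = [T_2]^{-1}\underline{\beta}_2 \tag{6.96}$$

where $[T_2]^{-1}$ is the reconstruction matrix. Substituting into equation (6.92), we obtain

$$\underline{y}_{2,j} = [U_2][P_2][T_2]^{-1}\underline{\beta}_2 \tag{6.97}$$

Assuming that the wavelet coefficients corresponding to some of the finer-resolution
levels can be neglected, the multiscale vector $\underline{\beta}_2$ can be truncated to $\tilde{N}_2$ nonzero
terms. Then, we obtain the following model for the second-order output:

$$\underline{y}_{2,j} = [U_2][P_2][\tilde{T}_2]^{-1}\underline{\tilde{\beta}}_2 \tag{6.98}$$

where $\underline{\tilde{\beta}}_2$ is the truncated vector of multiscale coefficients and $[\tilde{T}_2]^{-1}$ represents the
first $\tilde{N}_2$ columns of $[T_2]^{-1}$.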

Given the models for the discrete outputs of the first and second-order Volterra
operators:

$$\underline{y}_{1,j} = [U_1][P_1][\tilde{T}_1]^{-1}\underline{\tilde{\beta}}_1 = [A_1]\underline{\tilde{\beta}}_1 \tag{6.99}$$

$$\underline{y}_{2,j} = [U_2][P_2][\tilde{T}_2]^{-1}\underline{\tilde{\beta}}_2 = [A_2]\underline{\tilde{\beta}}_2 \tag{6.100}$$

the sampled output can be written as

$$\underline{y}_j = \underline{y}_{1,j} + \underline{y}_{2,j} \tag{6.101}$$

Then, we obtain a combined model of the form

$$\underline{y}_j = [A_1 \;\; A_2] \begin{Bmatrix} \underline{\tilde{\beta}}_1 \\ \underline{\tilde{\beta}}_2 \end{Bmatrix} = [A]\underline{\tilde{\beta}} \tag{6.102}$$

which must be solved in a least-squares sense for the wavelet coefficients $\underline{\tilde{\beta}}$ that
represent the first and second-order kernels. Once again, it is necessary to identify the
first and second-order kernels simultaneously because the terms in the Volterra series
are not orthogonal. Recall, from Section 2.3, that the second-order kernel over a square
domain can take several forms, of which only the symmetric form is unique. Therefore,
it is preferable to have the identified second-order kernel $h_{2,j_2}$ in symmetric form.
This is easily accomplished via the relationship

$$h_{2,sym,j_2}(\xi, \eta) = \frac{1}{2}\left\{ h_{2,j_2}(\xi, \eta) + h_{2,j_2}(\eta, \xi) \right\} \tag{6.103}$$

The $j_2$ subscript is left on the terms in equation (6.103) to denote that the second-order
kernel has been approximated on resolution level $j_2$.


6.4  The Least-Squares Problem

In the previous two sections, it has been shown that both wavelet-based kernel
identification algorithms can be reduced to a least-squares problem that must be
solved for the unknown wavelet coefficients. That is, the problem can be formulated
as

$$[A]\underline{\beta} = \underline{y} \tag{6.104}$$

where the matrix $[A]$ is known, $\underline{y}$ is a vector of measured outputs, and $\underline{\beta}$ is a vector
of unknown wavelet coefficients that represents the first and second-order Volterra
kernels. In this discussion, it is assumed that $[A]$ is an $m \times n$ matrix, $\underline{y}$ is an
$m$-dimensional vector, and $\underline{\beta}$ is an $n$-dimensional vector. Furthermore, it is assumed
that the system is overdetermined ($m > n$) since, in general, a large number of
observations yields a more accurate least-squares solution.

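Given a solution $\hat{\underline{\beta}}$ to equation (6.104), the residual $\underline{v}$ is defined as

$$\underline{v} := \underline{y} - [A]\hat{\underline{\beta}} \tag{6.105}$$

Then, a least-squares solution $\hat{\underline{\beta}}^*$ can be found that minimizes the least-squares error

$$e = \|\underline{v}\| = \sqrt{\underline{v}^T \underline{v}} \tag{6.106}$$

In other words, the least-squares solution $\hat{\underline{\beta}}^*$ satisfies

$$\|\underline{y} - [A]\hat{\underline{\beta}}^*\| \leq \|\underline{y} - [A]\hat{\underline{\beta}}\| \tag{6.107}$$

for all $n$-dimensional vectors $\hat{\underline{\beta}}$ [111]. The estimated output $[A]\hat{\underline{\beta}}$ is a weighted
combination of the columns of $[A]$ and is therefore in the column space of $[A]$, the
space spanned by the columns of $[A]$. This space, denoted $\mathrm{Col}(A)$, is a subset of the
space $\mathbb{R}^m$ that contains all $m$-dimensional vectors. That is, $\mathrm{Col}(A) \subset \mathbb{R}^m$. Thus,
the least-squares solution $\hat{\underline{\beta}}^*$ is the value of $\hat{\underline{\beta}}$ that minimizes the difference between
$\underline{y} \in \mathbb{R}^m$ and $[A]\hat{\underline{\beta}} \in \mathrm{Col}(A)$. It is a well-known theorem from linear algebra that the
orthogonal projection of $\underline{y}$ onto $\mathrm{Col}(A)$

$$\underline{y}^* = \mathrm{proj}_{\mathrm{Col}(A)}\,\underline{y} \tag{6.108}$$

minimizes this difference [111]. Hence, the least-squares solution $\hat{\underline{\beta}}^*$ satisfies

$$[A]\hat{\underline{\beta}}^* = \underline{y}^* \tag{6.109}$$

Since $\underline{y}^*$ is the orthogonal projection of $\underline{y}$ onto $\mathrm{Col}(A)$, it follows that the difference
$\underline{y} - \underline{y}^*$ is orthogonal to $\mathrm{Col}(A)$. Therefore, each column of $[A]$ must be orthogonal to
$\underline{y} - \underline{y}^*$:

$$\underline{a}_i^T(\underline{y} - \underline{y}^*) = \underline{a}_i^T(\underline{y} - [A]\hat{\underline{\beta}}^*) = 0, \quad i = 1, \ldots, n$$

where $\underline{a}_i$ denotes the $i$th column of $[A]$. Then, we can write

$$[A]^T(\underline{y} - [A]\hat{\underline{\beta}}^*) = \underline{0}$$

which leads to the standard normal equation

$$[A]^T[A]\hat{\underline{\beta}}^* = [A]^T\underline{y} \tag{6.110}$$

which can be solved for the least-squares solution $\hat{\underline{\beta}}^*$. For the case where $[A]$ has full
column rank, that is, where its $n$ columns are linearly independent, the matrix $[A]^T[A]$
is invertible and equation (6.110) has a unique solution given by

$$\hat{\underline{\beta}}^* = ([A]^T[A])^{-1}[A]^T\underline{y} \tag{6.111}$$

When $[A]$ is an ill-conditioned matrix, meaning that small errors in the entries of $[A]$
can result in large errors in the computed solution, factorization techniques such as
QR factorization, Cholesky factorization, and the singular value decomposition (SVD)
can sometimes be used to generate an accurate solution to the least-squares problem.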
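
The normal equation (6.110) and the orthogonality argument behind it can be checked numerically on a small example. The sketch below uses a random, well-conditioned matrix as a stand-in for $[A]$ (an assumption made purely for illustration) and compares the closed-form solution of equation (6.111) with NumPy's factorization-based solver.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 4))       # overdetermined: m = 50 > n = 4
y = rng.standard_normal(50)

# Closed-form solution of the normal equation, as in equation (6.111).
beta_normal = np.linalg.solve(A.T @ A, A.T @ y)

# Factorization-based least-squares solver (SVD-based in LAPACK).
beta_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)

# The residual should be orthogonal to every column of A.
residual = y - A @ beta_normal
orthogonality = A.T @ residual
```

For a well-conditioned matrix the two solutions agree to rounding error. Forming $[A]^T[A]$ squares the condition number, so for the ill-conditioned matrices discussed next, the factorization-based route is usually the safer choice.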

The identification of Volterra kernels falls into the category of inverse problems.
That is, given input/output data from the system, the task is to identify the Volterra
kernels that characterize the system. This is in contrast to direct problems where,
given a model of the system and an input, the goal is to predict the system response.
An unfortunate characteristic of inverse problems is that they are frequently ill-posed
in the sense that small errors in the matrix $[A]$ and the output measurement vector
$\underline{y}$ can lead to unstable solutions. Hadamard is often credited with formally defining
a well-posed problem to be one for which a solution exists, the solution is unique,
and the solution depends continuously on the data [112]. The last requirement
specifies that small errors in the data should lead to only small errors in the solution.
An ill-posed problem, then, can be defined as any problem that is not well-posed.
There is a great deal of literature pertaining to ill-posed problems, including books
by Groetsch [113] and Hansen [114]. The matrix formulations of the wavelet-based
kernel identification algorithms always include small errors due to the zero-order hold
discretization of the input and output. Furthermore, when working with experimental
data, the measured output always contains some degree of noise. Because of these
errors and the ill-posed nature of the problem, a least-squares solution of equation
(6.104) based on conventional solution techniques is often very unstable. For this
reason, regularization methods such as Tikhonov regularization and the truncated
singular value decomposition are needed in order to generate stable solutions to these
ill-posed problems. Tikhonov regularization effectively replaces an ill-posed problem
with a family of nearby well-posed problems [113]. The normal equation, equation
(6.110), is written in the regularized form

$$([A]^T[A] + \alpha[I])\hat{\underline{\beta}}_\alpha = [A]^T\underline{y} \tag{6.112}$$

where $\alpha$ is a regularization parameter and $[I]$ is the identity matrix. Then, the
least-squares solution $\hat{\underline{\beta}}_\alpha$ can be obtained as

$$\hat{\underline{\beta}}_\alpha = ([A]^T[A] + \alpha[I])^{-1}[A]^T\underline{y} \tag{6.113}$$

As the parameter $\alpha$ approaches 0, equation (6.112) approaches the original normal
equation. Basically, $\alpha$ is chosen to be large enough so that equation (6.113) yields a
stable solution.
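
Equation (6.113) translates directly into a few lines of code. The sketch below is an illustration only: the Vandermonde matrix is a hypothetical ill-conditioned stand-in for $[A]$, the noise level is arbitrary, and `tikhonov_solve` is not a routine from the dissertation's MATLAB programs.

```python
import numpy as np

def tikhonov_solve(A, y, alpha):
    """Solve (A^T A + alpha I) beta = A^T y, as in equation (6.113)."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ y)

# Hypothetical ill-conditioned test problem (monomial basis on [0, 1]).
rng = np.random.default_rng(2)
A = np.vander(np.linspace(0.0, 1.0, 30), 8, increasing=True)
y = A @ np.ones(8) + 1e-3 * rng.standard_normal(30)

beta_weak = tikhonov_solve(A, y, alpha=1e-6)     # light regularization
beta_strong = tikhonov_solve(A, y, alpha=1e-2)   # heavy regularization

norm_weak = np.linalg.norm(beta_weak)
norm_strong = np.linalg.norm(beta_strong)
```

Increasing $\alpha$ never increases the norm of the solution, which is the mechanism by which the regularized problem suppresses the large oscillatory components that noise would otherwise excite. The price is a bias toward zero, so $\alpha$ is a trade-off rather than a value to be maximized.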

The regularization technique used in this dissertation for solving the wavelet-based
kernel identification problem is the truncated singular value decomposition. A
discussion of the singular value decomposition can be found in any elementary linear
algebra text [111], [115]. The singular value decomposition (SVD) of an $m \times n$ matrix
$[A]$ is a factorization of the form

$$[A] = [U][\Sigma][V]^T \tag{6.114}$$

where $[U]$ and $[V]$ are orthogonal matrices of dimension $m \times m$ and $n \times n$, respectively.
The $m \times n$ matrix $[\Sigma]$ can be partitioned into a diagonal block $[D]$ and blocks of zeros
as

$$[\Sigma] = \begin{bmatrix} [D]_{r \times r} & 0_{r \times (n-r)} \\ 0_{(m-r) \times r} & 0_{(m-r) \times (n-r)} \end{bmatrix}$$

The diagonal matrix $[D]$ takes the form

$$[D] = \begin{bmatrix} \sigma_1 & & & \\ & \sigma_2 & & \\ & & \ddots & \\ & & & \sigma_r \end{bmatrix}$$

where $\sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_r > 0$ are the ordered singular values of $[A]$ and $r$ is the rank
of $[A]$. The singular values correspond to the square roots of the eigenvalues of the
symmetric matrix $[A]^T[A]$. It is not difficult to prove that all eigenvalues of $[A]^T[A]$
are real and nonnegative, and that exactly $r$ of them are nonzero [111]. Therefore, the
$r$ singular values are real and positive. Due to the structure of $[\Sigma]$, it is clear that the
last $(m - r)$ columns of $[U]$ and the last $(n - r)$ rows of $[V]^T$ do not contribute to $[A]$.
Then, the SVD of $[A]$ can be written in the reduced form

$$[A] = [U_r][D][V_r]^T \tag{6.115}$$

where $[U_r]$ and $[V_r]$ denote the first $r$ columns of $[U]$ and $[V]$ respectively. Equation
(6.115) can then be used to define the pseudoinverse of $[A]$, given by

$$[A]^+ = [V_r][D]^{-1}[U_r]^T \tag{6.116}$$

where the fact that $[U]$ and $[V]$ are orthogonal matrices has been used. The matrix
$[D]^{-1}$ is a diagonal matrix composed of the reciprocals of the singular values of $[A]$.
It is not difficult to verify that the pseudoinverse $[A]^+$ acts as an inverse in the sense
that

$$[A]^+[A] = [I]$$

when $[A]$ has full column rank ($r = n$). The product $[A][A]^+$ is the orthogonal projection
onto $\mathrm{Col}(A)$, which reduces to the identity only when $[A]$ is square and nonsingular.
The pseudoinverse can be used to compute a least-squares solution to equation (6.104)
of the form

$$\hat{\underline{\beta}}^* = [A]^+\underline{y} \tag{6.117}$$

It can be demonstrated that the solution given by equation (6.117) is indeed a
least-squares solution [111]. The pseudoinverse can alternatively be expressed as

$$[A]^+ = \sum_{i=1}^{r} \frac{1}{\sigma_i}\,\underline{v}_i\,\underline{u}_i^T \tag{6.118}$$

where $\underline{u}_i$ and $\underline{v}_i$ represent the $i$th columns of $[U]$ and $[V]$. Then, the least-squares
solution can be written as

$$\hat{\underline{\beta}}^* = \sum_{i=1}^{r} \frac{\underline{u}_i^T\underline{y}}{\sigma_i}\,\underline{v}_i \tag{6.119}$$

Ill-posed problems are characterized by the fact that the singular values $\{\sigma_i\}$
approach zero as $i$ approaches infinity. Therefore, equation (6.119) tends to yield
unstable solutions due to the $1/\sigma_i$ factors. The truncated singular value decomposition
is a regularization technique whereby the influence of some of the smaller singular
values is removed by truncating the summation in equation (6.119) [113]:

$$\hat{\underline{\beta}}_p = \sum_{i=1}^{p} \frac{\underline{u}_i^T\underline{y}}{\sigma_i}\,\underline{v}_i, \quad p < r \tag{6.120}$$

The choice of truncation level $p$ in equation (6.120) is not always a straightforward
matter. If $p$ is chosen to be too high (i.e., too many singular values are retained),
the resulting solution will be unstable. Conversely, if too few singular values are
kept, the solution will be inaccurate because too much information is lost and the
regularized problem does not closely resemble the original problem. Determining an
optimal cutoff point often requires viewing a plot of the singular values and trying
different truncation points. Qualitatively, an effective cutoff point is often indicated
by a leveling off of the singular values followed by a precipitous decline. In this case,
the point just before the singular values decline often serves as a good truncation point.
In their text, Colton and Kress [112] discuss the determination of the minimum number
of singular values needed to ensure that the norm of the estimation error is less than
a given value $\delta$. A theorem is given that guarantees that such a truncation point
exists for any choice of $\delta$. In any event, the truncated singular value decomposition
has proven to be an effective tool for obtaining stable solutions to the least-squares
problem and extracting wavelet-based kernel approximations.
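
The truncated sum in equation (6.120) can be written as a short routine. The following is a minimal sketch, assuming NumPy's `numpy.linalg.svd` in place of the MATLAB functions used in this work; the function name `tsvd_solve` and the random test matrix are hypothetical.

```python
import numpy as np

def tsvd_solve(A, y, p):
    """Truncated-SVD least-squares solution keeping the p largest singular values.

    Implements beta_p = sum_{i=1}^{p} (u_i^T y / sigma_i) v_i, equation (6.120).
    Also returns the singular values so that they can be plotted or inspected.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)   # s is sorted descending
    coeffs = (U[:, :p].T @ y) / s[:p]                   # u_i^T y / sigma_i
    return Vt[:p].T @ coeffs, s

rng = np.random.default_rng(3)
A = rng.standard_normal((20, 5))
y = rng.standard_normal(20)

beta_full, s = tsvd_solve(A, y, p=5)     # no truncation: ordinary least squares
beta_trunc, _ = tsvd_solve(A, y, p=3)    # discard the two smallest singular values
beta_ref, *_ = np.linalg.lstsq(A, y, rcond=None)
_, _, Vt = np.linalg.svd(A, full_matrices=False)
```

With `p` equal to the rank, the routine reproduces the ordinary least-squares solution. With a smaller `p`, the solution has no component along the discarded right singular vectors, which is exactly how the unstable $1/\sigma_i$ terms are removed.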
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6.5  Implementation 

Both  wavelet-based  kernel  identification  algorithms  discussed  in  this  chapter  are 
implemented  using  MATLAB  programs.  In  this  section,  some  of  the  details  of  these 
implementations  are  discussed.  Then,  in  the  following  chapter,  several  examples  will 
be  given. 

The first step in both implementations is to load the input and output data.
The algorithms require a dyadic sampling rate for the data; that is, the sampling rate
must be $2^j$ Hz, $j \in \mathbb{Z}$. When the data is obtained via simulations, this requirement
is easily satisfied. If, however, the data is obtained experimentally, the sampling rate
might not be dyadic. In this case, a separate MATLAB routine projects the data onto
Haar scaling functions on a chosen level $j \in \mathbb{Z}$. This effectively generates a zero-order
hold approximation of the input and output with a sampling step of $2^{-j}$.

The user sets several parameters for both kernel identification algorithms. The sampling
level $j$ must be provided, as well as the starting time (usually chosen to be 0) and final
time for the input/output data that is to be used for the identification. The user
also sets the finest-scale approximation levels for the first and second-order Volterra
kernels. For the triangular wavelet implementation, the finest-scale level for the
Haar representation of the first-order kernel is automatically chosen to be the same
as the discretization level of the input and output. The discretization level for the
second-order triangular kernel is also chosen. For the multiwavelet implementation,
separate discretization levels are set for the first and second-order kernels. These
levels are almost always selected to be coarser than the input/output sampling level
due to computational considerations. In addition, the user chooses whether to use
piecewise-linear, quadratic, or cubic multiwavelets to approximate the kernels.

The assumption of finite memory implies that the kernels decay to zero in a finite period
of time. Therefore, time durations must also be chosen for the first and second-order
kernels. In several of the simulation examples given in the next chapter, the
kernels are known in analytical form. Therefore, the user knows a priori the effective
time durations of these kernels. In the more general case where such knowledge is
not available, the time durations are determined via trial and error. It is usually
most effective to start with relatively long kernel durations. From the plots of the
identified kernels, it can easily be determined if the kernels can be truncated at earlier
times. Finally, the user must specify the coarsest level $j_0$ to be used in the multiscale
wavelet representations of the kernels. Usually, this is chosen to be the coarsest possible
level to give the largest number of different scales.

Once the parameters for the model have been set, the coefficient matrices $[A_1]$
and $[A_2]$ are formed. In the triangular implementation, the first-order input matrix
$[U_1]$ is formed as described in Section 6.2. Reconstruction functions have been
written in MATLAB for the Haar wavelet and the triangular wavelets. These routines
perform the wavelet reconstruction via a recursive application of equation (3.89) for
the Haar wavelet and equation (5.93) for the triangular wavelets. The reconstruction
matrix $[T_1]^{-1}$ is then formed column by column by sending a column vector of zeros
with a 1 in the $i$th row to the Haar reconstruction function. This effectively sifts
out the $i$th column of the reconstruction matrix $[T_1]^{-1}$. The triangular reconstruction
matrix $[T_2]^{-1}$ is constructed in a similar manner. In the second-order model,
the matrix product $[U_2][P_2]$ in equation (6.67) is formed directly rather than forming
$[U_2]$ and $[P_2]$ explicitly and multiplying them. This is done for computational efficiency
since the matrix $[P_2]$ is sparse. In order to compute this matrix product, integrals of
two-dimensional characteristic functions and triangular scaling functions of the form
in equation (6.62) must be calculated. This is not a simple task because the indexing
of the triangular scaling functions is rather complex. The calculation of the integrals
is accomplished via a subroutine that loops over the scaling functions and determines
which characteristic functions overlap the given scaling function. Then, the
corresponding inner products are calculated and equation (6.60) is used to generate the
entry of $[U_2][P_2]$ corresponding to that particular scaling function.
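
The column-by-column construction of the reconstruction matrix can be illustrated for the Haar case. The sketch below assumes the standard orthonormal Haar reconstruction step and a coefficient ordering of one coarse scaling coefficient followed by the wavelet coefficients from the coarsest to the finest level; equation (3.89) is not reproduced here, so both the recursion and the ordering are assumptions, and the function names are hypothetical.

```python
import numpy as np

def haar_reconstruct(beta, n_coarse=1):
    """Inverse orthonormal Haar transform (assumed convention).

    beta holds n_coarse scaling coefficients followed by the wavelet
    coefficients of successively finer levels (n_coarse, 2*n_coarse, ...).
    """
    beta = np.asarray(beta, dtype=float)
    a = beta[:n_coarse]
    pos = n_coarse
    while pos < len(beta):
        d = beta[pos:pos + len(a)]               # details on the current level
        pos += len(a)
        finer = np.empty(2 * len(a))
        finer[0::2] = (a + d) / np.sqrt(2.0)     # left half of each interval
        finer[1::2] = (a - d) / np.sqrt(2.0)     # right half of each interval
        a = finer
    return a

def reconstruction_matrix(n):
    """Build the n x n reconstruction matrix one column at a time."""
    T_inv = np.zeros((n, n))
    for i in range(n):
        e = np.zeros(n)
        e[i] = 1.0                               # unit vector with a 1 in row i
        T_inv[:, i] = haar_reconstruct(e)        # sifts out column i
    return T_inv

T_inv = reconstruction_matrix(8)
T_inv_truncated = T_inv[:, :4]                   # keep the 4 coarsest coefficients
```

Because the coefficients are ordered from coarse to fine, neglecting the finest levels amounts to keeping only the leading columns of the matrix, as in the truncated matrices $[\tilde{T}_1]^{-1}$ and $[\tilde{T}_2]^{-1}$.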

In the multiwavelet implementation, both the first and second-order models
require the computation of integrals of the products of scaling functions and
characteristic functions. The limits on these integrations depend on the difference in the
resolution levels of the kernel and the input. Unlike the triangular implementation,
where the integrals were of products of constant functions, the integrals in the
multiwavelet implementation involve products of piecewise-polynomial scaling functions
and characteristic functions. Therefore, a separate quadrature subroutine evaluates
the one-dimensional integrals numerically using Gauss-Legendre quadrature. The
resulting values of these integrals over one $2^{-j_1}$ block are stored in an array, where $j_1$
is the resolution level of the first-order kernel. In order to do this, the one scaling
function that has support over $[-1, 1]$ is treated as two "half functions." It is only
necessary to perform these integrations over one block since integrals over
subsequent blocks have the same values. Then, the array of integral values is of dimension
$(r + 1) \times 2^{j-j_1}$, where $2^{j-j_1}$ is the number of characteristic functions that intersect
a single scaling function. Because the two-dimensional integrals involve tensor
product scaling functions and characteristic functions, they are actually comprised of the
product of two one-dimensional integrals. Therefore, the two-dimensional integrals
are evaluated by first generating an $(r + 1) \times 2^{j-j_2}$ array of one-dimensional integrals
of scaling functions and characteristic functions, where $j_2$ is the resolution level of the
second-order kernel. Then, the two-dimensional integrals are calculated by
multiplying the appropriate two entries in the array of one-dimensional integral values. The
entries of the matrix products $[U_1][P_1]$ and $[U_2][P_2]$ are then calculated using
equations (6.79) and (6.91) respectively. Just as in the triangular case, reconstruction
functions have been written for the multiwavelets in one and two dimensions. These
routines perform the wavelet reconstruction through a recursive application of
equation (3.157) for the one-dimensional case and equation (3.184) in the two-dimensional
case. Because the multiwavelets are adapted to fit the domains of support of the first
and second-order kernels, slight modifications have been made to these equations to
account for the boundary coefficients. The one-dimensional multiwavelet
reconstruction matrix $[T_1]^{-1}$ is generated column by column using the same procedure as the
Haar case. Unfortunately, the two-dimensional case is not as simple because there is no
convenient matrix representation of the tensor product multiwavelet reconstruction.
Hence, in practice, the matrix $[T_2]^{-1}$ is not calculated and the single-scale coefficients
$\underline{\alpha}_{j_2}$ are left in the model.
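
The quadrature step described above can be sketched as follows. The example assumes a unit-length block, omits the normalization factor of the characteristic functions, and uses an arbitrary cubic polynomial as a stand-in for one polynomial piece of a scaling function; the function names are hypothetical, and `numpy.polynomial.legendre.leggauss` supplies the Gauss-Legendre nodes and weights.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def gauss_legendre(f, a, b, n_points=4):
    """Integrate f over [a, b] with an n_points Gauss-Legendre rule."""
    x, w = leggauss(n_points)                      # nodes and weights on [-1, 1]
    xm = 0.5 * (b - a) * x + 0.5 * (b + a)         # map nodes to [a, b]
    return 0.5 * (b - a) * np.sum(w * f(xm))

def block_integrals(phi, j, j1):
    """Integrals of phi over each of the 2^(j - j1) subintervals of [0, 1].

    Each subinterval is the support of one characteristic function that
    intersects the block, so the result is one row of the stored array.
    """
    n_sub = 2 ** (j - j1)
    edges = np.linspace(0.0, 1.0, n_sub + 1)
    return np.array([gauss_legendre(phi, edges[i], edges[i + 1])
                     for i in range(n_sub)])

phi = lambda x: x ** 3 - x                         # hypothetical cubic piece
vals = block_integrals(phi, j=5, j1=3)             # 2^(5-3) = 4 subintervals

# Two-dimensional tensor product integrals are products of 1-D integrals.
vals_2d = np.outer(vals, vals)
```

An $n$-point Gauss-Legendre rule is exact for polynomials up to degree $2n - 1$, so a four-point rule integrates piecewise-cubic pieces without quadrature error. Since the one-dimensional values are reused for every block and for the tensor product integrals, they only need to be computed once.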

In  both  implementations,  once  the  first  and  second-order  models  are  formed,  they 
are  combined  into  one  matrix.  The  resulting  matrix  equation  is  then  solved  for  the 
wavelet  coefficients  that  represent  the  first  and  second-order  kernels.  As  discussed 
in  the  previous  section,  the  least-squares  solution  is  obtained  using  the  truncated 
singular  value  decomposition.  The  singular  value  decomposition  is  computed  using 
the  MATLAB  “svd”  function  and  the  singular  values  are  then  plotted.  The  user 
then  determines  where  to  truncate  the  singular  values  and  the  pseudoinverse  in  equa- 
tion (6.120)  is  calculated  using  the  MATLAB  “pinv”  function.  The  pseudoinverse 
is  used  to  solve  for  the  wavelet  coefficients.  In  order  to  plot  the  resulting  kernels, 
the  single-scale  coefficients  are  reconstructed  from  the  multiscale  wavelet  coefficients 
using  the  reconstruction  functions.  The  kernels  are  then  plotted  in  terms  of  the  scal- 
ing functions  using  plotting  routines  that  have  been  developed  for  the  Haar  scaling 
functions,  triangular  scaling  functions,  and  multiwavelet  scaling  functions  in  one  and 
two  dimensions.  The  first  and  second-order  outputs  corresponding  to  the  identified 
Volterra  kernels  are  also  calculated.  These  outputs,  as  well  as  the  total  predicted 
output, are plotted in comparison to the actual sampled output. The main advantage
of using the multiscale wavelet coefficients in the model is that the coefficients
corresponding to certain finer-resolution levels can easily be neglected via truncation.
Hence,  the  number  of  nonzero  wavelet  coefficients  to  be  used  in  the  representation  of 
the  first  and  second-order  kernels  is  a user-defined  parameter.  Usually,  it  makes  sense 
to  generate  the  full-order  model  first.  Then,  the  number  of  nonzero  coefficients  can 
be  systematically  reduced.  This  corresponds  to  eliminating  certain  columns  in  the 
full-order matrix. In this manner, one can determine the smallest number of wavelet
coefficients  needed  to  represent  the  first  and  second-order  kernels.  Of  course,  the 
approximations  of  the  kernels  should  improve  as  the  number  of  nonzero  coefficients 
is  increased;  however,  including  more  wavelet  coefficients  than  is  necessary  can  ac- 
tually make  it  more  difficult  to  generate  a stable  least-squares  solution.  As  a final 
note  regarding  the  two-dimensional  multiwavelet  coefficients,  the  effect  of  having  the 
single-scale  coefficients  in  the  model  instead  of  the  multiscale  coefficients  is  that  the 
vector  of  coefficients  can  no  longer  be  truncated  to  neglect  certain  levels  in  the  ap- 
proximation. The  number  of  nonzero  terms  must  be  varied  by  changing  the  initial 
resolution  level  of  the  kernel.  For  example,  if  a multiscale  representation  is  desired 
over  levels  j0  through  j0  + 2 only,  the  resolution  level  for  the  single-scale  coefficients 
should  be  set  to  j0  + 3.  The  two  representations  are  equivalent.  Thus,  by  varying 
the  initial  resolution  level  of  the  kernel,  the  number  of  nonzero  wavelet  coefficients 
is  also  varied.  The  disadvantage  of  having  to  do  this  is  that  the  second-order  model 
must  be  regenerated  every  time  as  opposed  to  just  truncating  the  original  full-order 
model. 
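
The equivalence between a multiscale expansion over levels $j_0$ through $j_0 + 2$ and a single-scale expansion at level $j_0 + 3$ can be checked by counting coefficients. The following is a minimal sketch for the one-dimensional Haar case only, assuming one scaling function and one wavelet per dyadic interval. The function names are illustrative and are not part of the MATLAB implementation described in the text.

```python
def single_scale_count(level, duration):
    """Number of Haar scaling functions on `level` over `duration` seconds.

    Assumes a grid size of 2**(-level) seconds, so the count is
    duration * 2**level (required to be an integer here).
    """
    count = duration * 2.0 ** level
    assert count == int(count), "duration must be a multiple of the grid size"
    return int(count)


def multiscale_count(coarse_level, num_wavelet_levels, duration):
    """Scaling functions on the coarsest level plus wavelets on each level above it."""
    total = single_scale_count(coarse_level, duration)
    for j in range(coarse_level, coarse_level + num_wavelet_levels):
        # One Haar wavelet per interval on level j (assumption of this sketch).
        total += single_scale_count(j, duration)
    return total


# Multiscale over levels j0 .. j0+2 versus single scale at level j0+3
j0, duration = -3, 8
n_multi = multiscale_count(j0, 3, duration)
n_single = single_scale_count(j0 + 3, duration)
print(n_multi, n_single)
```

Both counts agree, which is why changing the single-scale resolution level has the same effect on model size as truncating the multiscale coefficient vector.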

At  this  point,  several  comments  should  be  made  regarding  the  computational 
limitations  of  these  algorithms.  In  both  cases,  the  main  limiting  factor  is  the  total 
number  of  unknown  coefficients  to  be  determined.  There  is  a limit  to  the  size  of 
a given  matrix  for  which  MATLAB  can  generate  a singular  value  decomposition. 
In  general,  this  limit  occurs  at  around  2000  coefficients.  Of  course,  the  number  of 
input/output data points is also a factor. This computational limit sets a constraint
on the resolution levels that can be chosen for the first and second-order kernels.
The  truncation  of  the  kernels  is  an  important  parameter  since  using  shorter-duration 
kernels  reduces  both  the  number  of  coefficients  in  the  model  and  the  computational 
time  to  form  the  model  matrix.  This,  in  turn,  enables  the  user  to  choose  higher 
levels  of  resolution  for  the  kernels.  Therefore,  it  is  always  advantageous  to  use  the 
shortest  possible  time  durations  for  the  kernels.  It  should  also  be  noted  that,  for  the 
triangular  wavelets,  the  resolution  of  the  triangular  kernel  is  directly  dependent  on 
the  size  of  the  kernel  as  well  as  the  chosen  resolution  level.  This  is  due  to  the  method 
of  construction  of  the  triangular  wavelets.  For  example,  a finest-resolution  level  of  4 
yields a single-scale approximation of the triangular kernel in terms of $4^4 = 256$ scaling
functions.  Clearly,  this  number  of  coefficients  provides  better  resolution  for  a kernel 
with  a duration  of  1 second  than  for  one  with  a time  duration  of  8 seconds.  In  the 
multiwavelet  implementation,  the  maximum  resolution  level  is  limited  in  part  by  the 
fact  that  there  are  multiple  scaling  functions  and  wavelets.  Given  r scaling  functions 
and  wavelets,  a representation  of  the  first-order  kernel  requires  approximately  r times 
the  number  of  functions  as  a Haar  approximation  at  the  same  level.  This  difference 
is  offset,  however,  by  the  fact  that  the  multiwavelets  have  higher  approximation 
order  than  the  Haar  wavelet.  Therefore,  an  accurate  representation  of  the  kernel  can 
be  achieved  at  a lower  resolution  level  using  the  multiwavelets.  In  any  event,  the 
current  computational  limits  of  the  MATLAB  implementations  put  an  effective  limit 
on  the  size  of  the  kernels  that  can  be  identified.  In  general,  this  limit  is  about  32 
seconds  for  the  first-order  kernel  and  8 seconds  for  the  second-order  kernel.  Despite 
these  limitations,  as  will  be  demonstrated  in  the  next  chapter,  first  and  second-order 
kernels  can  be  identified  for  a large  number  of  systems.  In  many  cases,  accurate 
estimates  of  the  kernels  can  be  obtained  with  a relatively  small  number  of  wavelet 
coefficients.
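
The trade-off between kernel duration, resolution level, and the coefficient limit can be illustrated with a short calculation. This is only a sketch: the $4^j$ count and the limit of roughly 2000 coefficients come from the discussion above, while the edge-spacing formula assumes that each refinement level bisects the edges of every triangle, which is an assumption made here for illustration. The sketch also ignores the first-order coefficients, which share the same budget.

```python
COEFFICIENT_LIMIT = 2000   # approximate limit on unknown coefficients quoted in the text


def triangular_scaling_count(level):
    """Single-scale triangular scaling functions at a level (4**level, so level 4 -> 256)."""
    return 4 ** level


def edge_spacing(level, duration):
    """Mesh spacing along one time axis, ASSUMING each level bisects the triangle edges."""
    return duration / 2 ** level


def finest_level_within_limit(limit=COEFFICIENT_LIMIT):
    """Largest level whose single-scale coefficient count does not exceed the limit."""
    level = 0
    while triangular_scaling_count(level + 1) <= limit:
        level += 1
    return level


print(triangular_scaling_count(4))                   # scaling functions at level 4
print(edge_spacing(4, 1.0), edge_spacing(4, 8.0))    # spacing for 1 s and 8 s kernels
print(finest_level_within_limit())
```

Under these assumptions the same 256 coefficients give a spacing eight times coarser for an 8-second kernel than for a 1-second kernel.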


CHAPTER  7 
NUMERICAL  RESULTS 


In  this  chapter,  the  wavelet-based  kernel  identification  algorithms  developed  in 
the  last  chapter  are  demonstrated  on  several  systems.  First,  a damped  linear  oscillator 
is  considered.  This  linear  system  can  be  completely  characterized  in  terms  of  the 
first-order  Volterra  kernel  which,  for  a linear  system,  is  equivalent  to  the  impulse 
response  function.  The  wavelet-based  algorithms  are  used  to  identify  the  first-order 
kernel  of  this  system.  Also,  the  effect  of  varying  kernel  length  is  investigated  for 
this  system.  The  next  example  is  a quadratic  nonlinear  system  whose  dynamics 
are  completely  described  in  terms  of  the  second-order  kernel.  This  system  serves 
as  a good  validation  case  for  the  second-order  parts  of  the  two  kernel  identification 
algorithms.  Next,  a nonlinear  oscillator  is  analyzed.  This  system  consists  of  a linear 
oscillator  with  an  added  quadratic  nonlinearity.  First  and  second-order  kernels  are 
identified  for  this  system.  The  identified  kernels  are  then  validated  by  comparing 
the  output  predicted  by  the  kernels  to  the  simulated  output  of  the  system  under  a 
variety  of  inputs.  As  a final  example,  Volterra  kernels  are  extracted  from  flight  flutter 
data  from  the  Aerostructures  Test  Wing,  which  was  flight-tested  at  NASA  Dryden  in 
2001.  It  is  demonstrated  that  the  wavelet-based  algorithms  can  be  used  to  identify 
kernels  from  noisy  flight  data.  This  example  also  provides  a practical  application  of 
these  kernel  identification  techniques  as  the  identified  first-order  kernels  can  be  used 
to  obtain  improved  flutter  estimates  for  aeroelastic  systems. 

7.1  Linear  Oscillator 

In this section, first-order Volterra kernels are identified for a damped linear
oscillator. The equation of motion for this system is given by

$$m\ddot{y}(t) + c\dot{y}(t) + ky(t) = u(t) \qquad (7.1)$$

where $y$ is the output displacement and $u$ is the input force. The system parameters
are the mass $m$, the damping coefficient $c$, and the spring stiffness $k$. In this example,
these parameters have been chosen as

$$m = 2\ \text{kg}, \qquad c = 3.2\ \text{N}\cdot\text{s/m}, \qquad k = 32\ \text{N/m} \qquad (7.2)$$

The initial position $y(0)$ and the initial velocity $\dot{y}(0)$ are both chosen to be zero. These
zero initial conditions ensure that the zero-order kernel $h_0$ is zero in this example.
Equation (7.1) is commonly written in the form

$$\ddot{y}(t) + 2\zeta\omega_n\dot{y}(t) + \omega_n^2 y(t) = \frac{1}{m}u(t) \qquad (7.3)$$

where the damping ratio $\zeta$ and the natural frequency $\omega_n$ are given by

$$\zeta = \frac{c}{2\sqrt{km}} \qquad (7.4)$$

$$\omega_n = \sqrt{\frac{k}{m}} \qquad (7.5)$$

For the system parameters given in equation (7.2), we have

$$\omega_n = 4\ \text{rad/s} = 0.637\ \text{Hz}, \qquad \zeta = 0.2 \qquad (7.6)$$

Figure 7.1: Linear Oscillator: First-Order Volterra Kernel


The  dynamics  of  this  linear  system  can  be  completely  described  in  terms  of  the 
first-order  Volterra  kernel.  For  a linear  system,  the  first-order  kernel  is  equivalent 
to  the  impulse  response  of  the  system.  The  impulse  response  of  equation  (7.1)  is  a 
well-known  result  from  vibration  theory  [116]  and  takes  the  form 

$$h_1(t) = \frac{1}{m\omega_d}\, e^{-\zeta\omega_n t} \sin(\omega_d t) \qquad (7.7)$$

where $\omega_d$ is the damped natural frequency, defined as

$$\omega_d = \omega_n\sqrt{1 - \zeta^2} \qquad (7.8)$$

In this example, $\omega_d = 3.92$ rad/s or 0.624 Hz. The first-order kernel for this system
is depicted in Figure (7.1).
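
The parameter values quoted in this section can be reproduced with a few lines of code. The following is a minimal sketch, assuming the underdamped case; the function names are illustrative and do not come from the original implementation.

```python
import math


def oscillator_parameters(m, c, k):
    """Return (zeta, omega_n, omega_d) for m*y'' + c*y' + k*y = u (underdamped case)."""
    omega_n = math.sqrt(k / m)                       # natural frequency, rad/s
    zeta = c / (2.0 * math.sqrt(k * m))              # damping ratio
    omega_d = omega_n * math.sqrt(1.0 - zeta ** 2)   # damped natural frequency, rad/s
    return zeta, omega_n, omega_d


def first_order_kernel(t, m, c, k):
    """Impulse response h1(t) of the underdamped oscillator, as in equation (7.7)."""
    zeta, omega_n, omega_d = oscillator_parameters(m, c, k)
    return math.exp(-zeta * omega_n * t) * math.sin(omega_d * t) / (m * omega_d)


zeta, omega_n, omega_d = oscillator_parameters(m=2.0, c=3.2, k=32.0)
print(round(zeta, 3), round(omega_n, 3), round(omega_d, 2))                   # rad/s values
print(round(omega_n / (2 * math.pi), 3), round(omega_d / (2 * math.pi), 3))   # Hz values
```

Sampling `first_order_kernel` on a time grid gives the analytical curve against which the identified kernels are compared.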

The  wavelet-based  kernel  identification  algorithms  are  used  to  extract  the  first- 
order  kernel  from  simulated  input/output  data  from  the  system.  The  input  to  the 
system  is  chosen  to  be  a signal  composed  of  three  sinusoids: 


$$u(t) = A_1\sin(2\pi\Omega_1 t) + A_2\sin(2\pi\Omega_2 t) + A_3\sin(2\pi\Omega_3 t) \qquad (7.9)$$

where

$$\{A_1, A_2, A_3\} = \{1, 2, 10\}\ \text{N}, \qquad \{\Omega_1, \Omega_2, \Omega_3\} = \{4, 8, 16\}\ \text{Hz} \qquad (7.10)$$

Figure 7.2: Linear Oscillator: Input

Figure 7.3: Linear Oscillator: Simulated Output

This  input  is  shown  in  Figure  (7.2).  For  clarity,  only  the  first  2 seconds  of  the  input 
have  been  plotted.  The  corresponding  output  is  obtained  via  a MATLAB  simula- 
tion and  is  depicted  in  Figure  (7.3).  In  total,  there  are  16  seconds  of  input/output 
data  sampled  at  256  Hz.  Therefore,  the  discretization  level  for  the  zero-order  hold 
approximation  of  the  input  and  output  is  j = 8.  From  Figure  (7.1),  it  is  apparent 
that  the  first-order  kernel  decays  to  zero  in  approximately  8 seconds.  Therefore,  the 
kernel  length  is  set  to  8 seconds  in  the  kernel  identification  algorithms.  The  identified 
first-order  kernels  from  the  Haar  identification  are  shown  in  Figure  (7.4)  for  a varying 
number  of  nonzero  terms.  Recall  that  the  number  of  nonzero  terms  is  dependent  on 
the  number  of  resolution  levels  included  in  the  multiscale  approximation  of  the  kernel, 
as  well  as  the  time  duration  of  the  identified  kernel.  Similar  plots  of  the  identified 
kernels  from  the  multiwavelet  implementation  are  shown  in  Figures  (7.5)  through 
(7.7). These figures correspond to the kernel identification using the piecewise-linear,
quadratic, and cubic multiwavelets, respectively. For each of the kernel plots, the resolution
level of the identified kernel is given as well as the corresponding number of
nonzero terms. The coarsest possible level in the multiscale representation of the kernels
is level $j_0 = -3$, which corresponds to a grid size of $2^{3} = 8$ seconds, the duration
of the identified kernels. Then, as an example, a level 5 approximation of the kernel
corresponds to a single-scale expansion in terms of scaling functions on level 5 or,
equivalently, a multilevel expansion in terms of wavelets on levels 4 through $-3$ and
scaling functions on level $-3$. In each of the plots, the identified kernel is represented
by a solid line while the analytical kernel is depicted with a dashed line. The output
predicted by each of the identified kernels is shown in Figures (7.8) through (7.11).
For  clarity,  only  the  first  8 seconds  of  identified  output  is  shown.  Once  again,  in  each 
plot  the  solid  line  denotes  the  predicted  output  while  the  dashed  line  represents  the 
actual  simulated  output. 
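
The output predicted by a first-order kernel is a convolution of the kernel with the input. The following is a minimal sketch of that calculation using the analytical kernel and a rectangle-rule sum at the 256 Hz sampling rate. It is not the MATLAB simulation used to generate the data in this section, and the function names are assumptions made for illustration.

```python
import math


def first_order_kernel(t, m=2.0, c=3.2, k=32.0):
    """Analytical impulse response h1(t) of the underdamped oscillator."""
    omega_n = math.sqrt(k / m)
    zeta = c / (2.0 * math.sqrt(k * m))
    omega_d = omega_n * math.sqrt(1.0 - zeta ** 2)
    return math.exp(-zeta * omega_n * t) * math.sin(omega_d * t) / (m * omega_d)


def three_tone_input(t, amplitudes=(1.0, 2.0, 10.0), freqs_hz=(4.0, 8.0, 16.0)):
    """Sum of three sinusoids with the amplitudes (N) and frequencies (Hz) given in the text."""
    return sum(a * math.sin(2.0 * math.pi * f * t) for a, f in zip(amplitudes, freqs_hz))


def first_order_output(kernel_samples, input_samples, dt):
    """Rectangle-rule approximation of y1(t_n) = integral of h1(xi) * u(t_n - xi) d(xi)."""
    output = []
    for n in range(len(input_samples)):
        upper = min(n, len(kernel_samples) - 1)   # the kernel is truncated at its length
        acc = 0.0
        for i in range(upper + 1):
            acc += kernel_samples[i] * input_samples[n - i]
        output.append(acc * dt)
    return output


fs = 256                     # sampling rate in Hz (discretization level j = 8)
dt = 1.0 / fs
kernel_length = 8.0          # seconds, as chosen in the text
h1 = [first_order_kernel(i * dt) for i in range(int(kernel_length * fs))]
u = [three_tone_input(n * dt) for n in range(2 * fs)]   # first 2 s only, to keep the loop short
y1 = first_order_output(h1, u, dt)
```

Replacing `h1` with the samples of an identified kernel gives the corresponding predicted output for comparison with the simulated output.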

In  all  cases,  the  identified  kernels  and  the  corresponding  outputs  converge  to 
the  true  analytical  results  as  the  number  of  nonzero  terms  is  increased.  This  is  ex- 
pected since  the  results  should  improve  as  more  levels  of  resolution  are  included  in 
the  multiscale  representation  of  the  kernel.  This  example  illustrates  the  effect  of  the 
approximation  order  of  the  wavelets  on  the  identified  kernels  and  outputs.  From  Fig- 
ure (7.4),  it  can  be  seen  that  the  Haar  identification  of  the  kernel  begins  to  resemble 
the  form  of  the  analytical  kernel  when  the  resolution  level  is  increased  to  2,  corre- 
sponding to  32  nonzero  terms  in  the  representation.  The  identified  kernel  does  not 
closely  match  the  analytical  kernel,  however,  until  the  resolution  level  is  increased  to 
5,  a 256-term  approximation.  This  fact  is  mirrored  in  the  corresponding  identified 
outputs  in  Figure  (7.8)  where  the  identified  output  does  not  converge  to  the  actual 
output  until  a resolution  level  of  5 is  reached.  It  is  interesting  to  note  that,  although 
the  error  in  the  128-term  identification  of  the  kernel  does  not  appear  to  be  large,  the 
resulting  identified  output  clearly  shows  significant  error.  This  error  occurs  mainly 
in  the  first  2 seconds  of  the  output,  which  is  influenced  only  by  the  first  2 seconds 
of  the  identified  kernel.  This  is  the  portion  of  the  identified  kernel  that  is  most  in 
error.  That,  coupled  with  the  relatively  high  frequency  of  the  input  signal,  results  in 
the  high  frequency  oscillations  in  the  beginning  of  the  identified  output.  The  multi- 
wavelet identifications  of  the  first-order  kernel,  shown  in  Figures  (7.5)  through  (7.7), 
demonstrate  that,  compared  to  the  Haar  identification,  fewer  levels  of  resolution  are 
needed  to  converge  to  the  analytical  kernel.  This  can  be  attributed  to  the  fact  that 
the  multiwavelets  possess  higher  approximation  order  than  the  Haar  wavelet.  The 
piecewise-linear multiwavelet identification resembles the analytical kernel fairly well
once the resolution level is increased to 0, corresponding to 33 nonzero terms. Once
the resolution level is increased to 2 (129 terms), the identified kernel can no longer
be distinguished from the analytical kernel in the figure. The piecewise-quadratic
multiwavelet identification shows slightly improved results. The level 0 (33-term)
identification of the kernel closely resembles the analytical kernel and the level 2
(65-term) identification cannot be discerned from the analytical kernel in the figure.
Similar results are observed for the piecewise-cubic multiwavelet identification of the
kernel in Figure (7.7) where, once again, the level 0 (41-term) identification closely
matches the analytical kernel and the level 2 (161-term) identification completely
converges to it. From the plots of the piecewise-linear identification of the output in
Figure (7.9), it can be observed that the level 1 identification matches the actual output
fairly well but does not completely converge until the resolution level is increased
to 3 (257 terms). On the other hand, the piecewise-quadratic and piecewise-cubic
output identifications, shown in Figures (7.10) and (7.11), both closely resemble the
actual output at a resolution level of 0. In both cases, once the resolution level is
increased to 1, the identified output completely converges to the actual output. In
comparing the Haar results to the multiwavelet results, it is clear that fewer levels of
resolution are required for kernel identification with the multiwavelets. However, at
each resolution level, the multiwavelet identification involves more terms since there
are multiple wavelets. In any event, both wavelet-based algorithms provide relatively
low-order estimates of the first-order kernel. In comparison, a simple discrete model
of an 8-second first-order kernel using input/output data that has been sampled at
256 Hz would require a total of 2048 discrete coefficients.

Figure 7.4: Haar Wavelet Identification of the First-Order Volterra Kernel
Panels: Level -3 (1 term), Level -2 (2 terms), Level -1 (4 terms), Level 0 (8 terms), Level 1 (16 terms), Level 2 (32 terms), Level 3 (64 terms), Level 4 (128 terms), Level 5 (256 terms), Level 6 (512 terms)

Figure 7.5: Piecewise-Linear Multiwavelet Identification of the First-Order Volterra Kernel
Panels: Level -3 (5 terms), Level -2 (9 terms), Level -1 (17 terms), Level 0 (33 terms), Level 1 (65 terms), Level 2 (129 terms), Level 3 (257 terms), Level 4 (513 terms)

Figure 7.6: Piecewise-Quadratic Multiwavelet Identification of the First-Order Volterra Kernel
Panels: Level -3 (5 terms), Level -2 (9 terms), Level -1 (17 terms), Level 0 (33 terms), Level 1 (65 terms), Level 2 (129 terms), Level 3 (257 terms), Level 4 (513 terms)

Figure 7.7: Piecewise-Cubic Multiwavelet Identification of the First-Order Volterra Kernel
Panels: Level -3 (6 terms), Level -2 (11 terms), Level -1 (21 terms), Level 0 (41 terms), Level 1 (81 terms), Level 2 (161 terms), Level 3 (321 terms), Level 4 (641 terms)

Figure 7.8: Haar Wavelet Identification of the Output
Panels: Level -3 (1 term), Level -2 (2 terms), Level -1 (4 terms), Level 0 (8 terms), Level 1 (16 terms), Level 2 (32 terms), Level 3 (64 terms), Level 4 (128 terms), Level 5 (256 terms), Level 6 (512 terms)

Figure 7.9: Piecewise-Linear Multiwavelet Identification of the Output
Panels include: Level -3 (5 terms), Level -2 (9 terms)

Figure 7.10: Piecewise-Quadratic Multiwavelet Identification of the Output
Panels: Level -3 (5 terms), Level -2 (9 terms), Level -1 (17 terms), Level 0 (33 terms), Level 1 (65 terms), Level 2 (129 terms), Level 3 (257 terms), Level 4 (513 terms)

Figure 7.11: Piecewise-Cubic Multiwavelet Identification of the Output
Panels include: Level -3 (6 terms), Level -2 (11 terms), Level 1 (81 terms), Level 2 (161 terms)
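
The Haar term counts and the size of the direct discrete model can be compared with a small calculation. This is a sketch under the grid-size convention used in this section (a grid size of $2^{-j}$ seconds at level $j$); the function names are illustrative.

```python
def haar_term_count(level, kernel_length):
    """Number of nonzero terms in a Haar expansion truncated at `level`.

    Assumes a grid size of 2**(-level) seconds, so an 8-second kernel has
    one term at level -3 and 256 terms at level 5.
    """
    return int(round(kernel_length * 2.0 ** level))


def discrete_coefficient_count(kernel_length, sample_rate_hz):
    """Number of samples in a direct discrete-time model of the kernel."""
    return int(round(kernel_length * sample_rate_hz))


kernel_length = 8.0   # seconds
for level in range(-3, 7):
    print(level, haar_term_count(level, kernel_length))
print("discrete:", discrete_coefficient_count(kernel_length, 256))
```

At level 5 the Haar model uses one eighth of the coefficients required by the discrete model sampled at 256 Hz.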

As  a final  note,  in  this  example,  it  is  straightforward  to  select  the  optimal  kernel 
length  since  the  first-order  kernel  has  a known  analytical  form.  In  practice,  this 
information  is  usually  not  available.  Therefore,  it  is  of  interest  to  consider  what 
happens if a different kernel length is chosen. First, suppose that the kernel length is
selected to be 16 seconds, the full duration of the data set. Figure (7.12) shows the
resulting  identified  kernel  in  terms  of  piecewise-quadratic  multiwavelets  at  resolution 
level  3,  corresponding  to  257  nonzero  terms.  Clearly,  there  is  no  harm  in  choosing 
a kernel  length  that  is  longer  than  necessary  other  than  the  fact  that  computational 
effort  is  needlessly  spent  identifying  many  zero-valued  coefficients.  Now,  consider 
the case in which the kernel length is selected to be 4 seconds. A level 3 (33-term)
piecewise-quadratic  multiwavelet  identification  of  the  kernel  is  shown  in  Figure  (7.13) 
in  comparison  to  the  first  4 seconds  of  the  analytical  kernel.  The  identified  kernel 
matches  the  analytical  kernel  well  due  to  the  fact  that  very  little  is  lost  when  the  kernel 
is  truncated  at  4 seconds.  The  resulting  identified  output,  shown  in  Figure  (7.14), 
has  a small  amount  of  error  due  to  the  truncation  of  the  kernel.  Finally,  consider 
the  effect  of  choosing  a kernel  length  of  2 seconds.  In  this  case,  a significant  portion 
of  the  kernel  has  been  lost.  The  piecewise-quadratic  identification  of  the  kernel  at 
resolution  level  3 (17  terms),  shown  in  Figure  (7.15),  still  matches  the  analytical 
kernel  fairly  well,  except  at  the  end  where  it  begins  to  diverge.  The  identified  output, 
depicted  in  Figure  (7.16),  shows  significant  error,  however,  due  to  the  truncation  of 
the  kernel.  This  error  does  not  appear  until  a time  of  about  2 seconds,  corresponding 
to  the  time  when  the  truncated  part  of  the  kernel  starts  to  have  an  influence  on  the 
output.  It  is  interesting  to  note  that  increasing  the  number  of  nonzero  terms  does 
not  improve  the  kernel  estimate  in  this  case,  but  rather  results  in  a less  stable  kernel. 
This  can  be  attributed  to  the  fact  that  the  identification  is  attempting  to  compensate 
for  the  missing  part  of  the  kernel  in  fitting  the  output.  As  a practical  point,  the 
mere  fact  that  the  identified  2-second  kernel  has  not  decayed  to  zero,  as  well  as  the 
significant  error  in  the  identified  output,  is  a clear  indication  that  a longer  kernel 
length is needed.

Figure 7.12: Piecewise-Quadratic Identification of the First-Order Kernel (16 sec.)

Figure 7.13: Piecewise-Quadratic Identification of the First-Order Kernel (4 sec.)

Figure 7.14: Predicted Output from the Identified 4-Second Kernel

Figure 7.15: Piecewise-Quadratic Identification of the First-Order Kernel (2 sec.)

Figure 7.16: Predicted Output from the Identified 2-Second Kernel

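
The effect of kernel length can be anticipated from the decay envelope $e^{-\zeta\omega_n t}$ of the analytical kernel. The sketch below evaluates the envelope at each candidate kernel length. It assumes the analytical form of the kernel is known, which, as noted above, is usually not the case in practice, and the function name is illustrative.

```python
import math


def envelope_at(t, m=2.0, c=3.2, k=32.0):
    """Decay envelope exp(-zeta*omega_n*t) of h1, normalized to 1 at t = 0."""
    omega_n = math.sqrt(k / m)
    zeta = c / (2.0 * math.sqrt(k * m))
    return math.exp(-zeta * omega_n * t)


for length in (2.0, 4.0, 8.0, 16.0):
    # Fraction of the initial envelope amplitude remaining at the truncation time
    print(length, envelope_at(length))
```

Roughly 20 percent of the envelope amplitude remains at 2 seconds, compared with about 4 percent at 4 seconds and well under 1 percent at 8 seconds, which is consistent with the truncation errors observed for the shorter kernels.
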
7.2  Quadratic  Nonlinear  System 

In  this  section,  a system  with  a quadratic  nonlinearity  is  considered.  Because  the 
system  dynamics  are  completely  characterized  by  the  second-order  Volterra  kernel,  it 
serves  as  a good  validation  case  for  the  second-order  kernel  identification  algorithms. 
In  the  literature  [15],  [16],  it  has  been  demonstrated  that  the  system  shown  in  Figure 
(7.17)  is  a quadratic  nonlinear  system.  This  system  is  composed  of  a linear  block 
with first-order kernel $h_1$ followed by a block that squares the output $v$ from the first
block. The output $y$ of this system is given as

$$\begin{aligned}
y(t) = v^2(t) &= \left[\int_0^t h_1(\xi)\,u(t-\xi)\,d\xi\right]^2 \\
&= \int_0^t h_1(\xi)\,u(t-\xi)\,d\xi \int_0^t h_1(\eta)\,u(t-\eta)\,d\eta \\
&= \int_0^t\!\!\int_0^t h_1(\xi)\,h_1(\eta)\,u(t-\xi)\,u(t-\eta)\,d\xi\,d\eta
\end{aligned} \qquad (7.11)$$

where $u$ is the input. Equation (7.11) is of the same form as the output of the
second-order Volterra operator

$$y_2(t) = \int_0^t\!\!\int_0^t h_2(\xi,\eta)\,u(t-\xi)\,u(t-\eta)\,d\xi\,d\eta \qquad (7.12)$$

where the second-order kernel $h_2$ is given by

$$h_2(\xi,\eta) = h_1(\xi)\,h_1(\eta) \qquad (7.13)$$

Figure 7.17: Quadratic Nonlinear System

Therefore,  it  is  clear  that  the  system  in  Figure  (7.17)  is  indeed  a quadratic  nonlinear 
system. 
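
The equivalence between squaring the output of the linear block and applying a second-order Volterra operator with a product kernel can be checked numerically. The following is a minimal sketch on a few made-up samples, using rectangle-rule sums in place of the integrals. The sample values and function names are purely illustrative.

```python
def linear_block_output(h1, u, n, dt):
    """v(t_n) ~ dt * sum_i h1[i] * u[n - i] (rectangle rule)."""
    upper = min(n, len(h1) - 1)
    return dt * sum(h1[i] * u[n - i] for i in range(upper + 1))


def second_order_output(h2, u, n, dt):
    """y2(t_n) ~ dt^2 * sum_i sum_j h2[i][j] * u[n - i] * u[n - j]."""
    upper = min(n, len(h2) - 1)
    total = 0.0
    for i in range(upper + 1):
        for j in range(upper + 1):
            total += h2[i][j] * u[n - i] * u[n - j]
    return dt * dt * total


# Small made-up samples, for illustration only
dt = 0.1
h1 = [0.0, 0.8, 0.5, 0.2, 0.05]
u = [1.0, -0.5, 2.0, 0.3, -1.2, 0.7]
h2 = [[a * b for b in h1] for a in h1]   # product kernel h2(xi, eta) = h1(xi) * h1(eta)

squared = [linear_block_output(h1, u, n, dt) ** 2 for n in range(len(u))]
volterra = [second_order_output(h2, u, n, dt) for n in range(len(u))]
print(max(abs(a - b) for a, b in zip(squared, volterra)))
```

The two outputs agree to within floating-point rounding, since the square of a sum expands into the double sum over the product kernel.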

In  this  example,  the  linear  subsystem  is  taken  to  have  the  same  form  as  the  linear 
oscillator that is analyzed in Section 7.1. Recall that the first-order kernel $h_1$ for such
a system is given by

$$h_1(t) = \frac{1}{m\omega_d}\, e^{-\zeta\omega_n t}\sin(\omega_d t) \qquad (7.14)$$

The second-order kernel, from equation (7.13), then takes the form

$$h_2(\xi,\eta) = \left(\frac{1}{m\omega_d}\right)^2 e^{-\zeta\omega_n(\xi+\eta)}\sin(\omega_d\xi)\sin(\omega_d\eta) \qquad (7.15)$$

Clearly, this kernel is in symmetric form since $h_2(\eta,\xi) = h_2(\xi,\eta)$. In this example,
the parameters for the linear subsystem are taken as

$$m = 2\ \text{kg}, \qquad c = 35\ \text{N}\cdot\text{s/m}, \qquad k = 32\pi^2\ \text{N/m} \qquad (7.16)$$

Then, the corresponding damping ratio, natural frequency, and damped natural frequency
have the following values:

$$\zeta = 0.696, \qquad \omega_n = 12.6\ \text{rad/s} = 2\ \text{Hz}, \qquad \omega_d = 9.02\ \text{rad/s} = 1.44\ \text{Hz} \qquad (7.17)$$

Figure 7.18: Analytical First-Order Kernel for the Linear Subsystem

The  analytical  first-order  kernel  for  the  linear  subsystem  is  shown  in  Figure  (7.18). 
The  kernel  decays  to  zero  in  about  1 second,  implying  that  the  size  of  the  second- 
order  kernel  can  be  set  to  1 second  in  the  identification.  The  analytical  second-order 
kernel  from  equation  (7.15)  is  depicted  in  Figure  (7.19). 
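
The parameter values and the choice of a 1-second kernel can be checked directly. This is a minimal sketch that evaluates the symmetric second-order kernel as the product of two first-order kernels; the constant and function names are assumptions made for illustration.

```python
import math

M, C, K = 2.0, 35.0, 32.0 * math.pi ** 2   # parameters of the linear subsystem
OMEGA_N = math.sqrt(K / M)                  # natural frequency, rad/s
ZETA = C / (2.0 * math.sqrt(K * M))         # damping ratio
OMEGA_D = OMEGA_N * math.sqrt(1.0 - ZETA ** 2)   # damped natural frequency, rad/s


def h1(t):
    """First-order kernel of the linear subsystem."""
    return math.exp(-ZETA * OMEGA_N * t) * math.sin(OMEGA_D * t) / (M * OMEGA_D)


def h2_sym(xi, eta):
    """Symmetric second-order kernel h2(xi, eta) = h1(xi) * h1(eta)."""
    return h1(xi) * h1(eta)


print(round(ZETA, 3), round(OMEGA_N, 1), round(OMEGA_D, 2))
print(math.exp(-ZETA * OMEGA_N * 1.0))   # envelope remaining after 1 second
```

The envelope has fallen below one part in a thousand after 1 second, which supports truncating the second-order kernel at that length.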

Figure 7.19: Analytical Second-Order Kernel

The wavelet-based kernel identification algorithms are used to extract the second-order
kernel from input/output data from this system. The triangular wavelets are
used to identify the triangular form of the kernel while tensor product multiwavelets
are implemented in the identification of the symmetric form of the kernel. The input
is chosen to be a linear chirp from 0 to 2 Hz over a 16-second time range. A linear
chirp is simply a sinusoidal signal in which the frequency is varied in a linear fashion.
The amplitude $A$ of the input force is chosen as 10 N. This input signal and
the corresponding output are shown in Figures (7.20) and (7.21), respectively. The
input/output  data  is  sampled  at  256  Hz,  so  the  resolution  level  is  8 for  the  zero-order 
hold  approximation  of  the  input  and  output.  The  triangular  wavelet  identification  of 
the  second-order  kernel  is  shown  in  Figure  (7.22)  for  varying  resolution  levels.  The 
identified kernels are in triangular form which, as discussed in Section 2.3, can be
related to the symmetric kernel $h_{2,sym}$ by the expression [15]

$$h_{2,tri}(\xi,\eta) = 2\,h_{2,sym}(\xi,\eta)\,\delta_{-1}(\eta-\xi) \qquad (7.18)$$

where the unit step function is defined as

$$\delta_{-1}(\eta-\xi) := \begin{cases} 1, & \xi < \eta \\ 0, & \xi > \eta \end{cases} \qquad (7.19)$$

Simply put, the triangular kernel is equal to double the symmetric kernel inside the
triangular domain bounded by $0 \le \xi \le \eta$, $0 \le \eta \le 1$ and zero outside of the
domain. Figures (7.23) through (7.25) depict the identified kernels using piecewise-linear,
quadratic, and cubic multiwavelets, respectively. Also, the analytical kernel
is shown for comparison. It should be noted that, because the multiwavelet kernels
are symmetric, the number of unique terms that specify the kernels is less than the
number indicated in the figures. In general, then, $N_2^2$ nonzero terms correspond to
$\frac{1}{2}(N_2^2 + N_2)$ unique terms due to the symmetry. The output corresponding to the
identified triangular kernels is plotted in Figure (7.26) while the output predicted by
the symmetric kernels, identified in terms of the multiwavelets, is shown in Figures
(7.27) through (7.29). For clarity, only the first 8 seconds of the output have been
plotted. The identified output is depicted by a solid line while the actual simulated
output is represented by a dashed line.

Figure 7.20: Chirp Input Excitation (0-2 Hz)

Figure 7.21: Simulated Output
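
The linear chirp excitation described in this section can be generated as follows. This is a minimal sketch: the sweep range, duration, amplitude, and sampling rate come from the text, while the sine form with zero initial phase and the function name are assumptions made for illustration.

```python
import math


def linear_chirp(t, f_start=0.0, f_end=2.0, duration=16.0, amplitude=10.0):
    """Linear chirp whose instantaneous frequency rises linearly from f_start to f_end."""
    sweep_rate = (f_end - f_start) / duration   # Hz per second
    # The phase is 2*pi times the integral of the instantaneous frequency.
    phase = 2.0 * math.pi * (f_start * t + 0.5 * sweep_rate * t * t)
    return amplitude * math.sin(phase)


fs = 256   # sampling rate in Hz
u = [linear_chirp(n / fs) for n in range(16 * fs)]
print(len(u), max(abs(x) for x in u))
```

The quadratic phase term is what distinguishes the chirp from a fixed-frequency sinusoid; the instantaneous frequency at time $t$ is `f_start + sweep_rate * t`.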


In  general,  the  identified  second-order  kernels  converge  to  the  analytical  kernel  as 
the  number  of  resolution  levels  is  increased.  It  is  immediately  apparent,  however,  that 
the  results  do  not  show  an  exact  convergence  of  the  identified  kernels.  This  can  be 
attributed  to  the  fact  that  the  second-order  model  in  both  wavelet  implementations 
is  more  ill-conditioned  than  the  first-order  model.  Despite  regularizing  the  problem 
using  the  truncated  singular  value  decomposition,  exact  results  are  difficult  to  obtain. 
Nevertheless,  the  identified  kernels  clearly  have  the  proper  form  and  converge  to  the 
correct  amplitude.  The  triangular  wavelet  results  are  somewhat  more  difficult  to 
evaluate  than  the  multiwavelet  results.  From  Figure  (7.22),  the  triangular  kernels 
appear  to  converge  to  a structure  that  is  similar  in  form  to  the  analytical  kernel 
with  twice  the  amplitude.  The  finest-resolution  kernel  is  still  relatively  coarse,  but 
due  to  computational  limits  in  MATLAB,  it  is  currently  not  possible  to  increase  the 
resolution  any  further.  The  identified  output  from  the  triangular  kernels,  shown  in 
Figure  (7.26),  converges  well  to  the  actual  output.  This  implies  that  more  resolution 
is not necessary to gain a useful identification of the triangular kernel.

Figure 7.22: Triangular Wavelet Identification of the Second-Order Volterra Kernel
(panels: Level 0, 1 term; Level 1, 4 terms; Level 2, 16 terms; Level 3, 64 terms;
Level 4, 256 terms; Level 5, 1024 terms)

Figure 7.23: Piecewise-Linear Multiwavelet Identification of the Second-Order Volterra
Kernel (panels: analytical second-order kernel; Level 0, 25 terms; Level 1, 81 terms;
Level 2, 289 terms; Level 3, 1089 terms)

Figure 7.24: Piecewise-Quadratic Multiwavelet Identification of the Second-Order Volterra
Kernel (panels: Level 0, 25 terms; Level 1, 81 terms; Level 2, 289 terms;
Level 3, 1089 terms)

Figure 7.25: Piecewise-Cubic Multiwavelet Identification of the Second-Order Volterra
Kernel

Figure 7.26: Predicted Output from the Identified Triangular Kernels
(panels: Level 0, 1 term; Level 1, 4 terms; Level 2, 16 terms; Level 3, 64 terms;
Level 4, 256 terms; Level 5, 1024 terms)

Figure 7.27: Predicted Output from the Piecewise-Linear Multiwavelet Identification
of the Second-Order Kernel (panels: Level 0, 25 terms; Level 1, 81 terms;
Level 2, 289 terms; Level 3, 1089 terms)

Figure 7.28: Predicted Output from the Piecewise-Quadratic Multiwavelet Identification
of the Second-Order Kernel (panels: Level 0, 25 terms; Level 1, 81 terms;
Level 2, 289 terms; Level 3, 1089 terms)

Figure 7.29: Predicted Output from the Piecewise-Cubic Multiwavelet Identification
of the Second-Order Kernel (panels: Level 0, 36 terms; Level 1, 121 terms;
Level 2, 441 terms; Level 3, 1681 terms)

The kernels identified using the piecewise-linear, quadratic, and cubic multiwavelets all
exhibit similar behavior. In all cases, the identified kernels converge to the proper
amplitude and form. However, in all three cases, as the resolution level is increased to 3,
the identified kernels actually become less accurate than the level 2 kernels. As mentioned
before, this is a result of the ill-conditioned nature of the second-order Volterra model.
As the number of coefficients in the model is increased, the solution becomes less
stable. This is characteristic behavior for an ill-posed inverse problem. Of course, the
input has an influence on the model as well. Increasing the number of data points and
using a different input could potentially improve the results. At any rate, as shown
in Figures (7.27) through (7.29), the identified kernels are capable of reproducing the
simulated output of the system. In conclusion, the results show that both wavelet
implementations are capable of extracting reasonable estimates of the second-order
kernel for this nonlinear system. These estimates require a relatively small number of
coefficients, especially considering that a simple discrete model of the same second-order
kernel (of 1-second duration) would require $256^2$, or 65,536, terms.
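As a rough illustration of the comparison above, the following Python sketch tabulates the
term counts listed in the captions of Figures (7.22) through (7.29) against the 65,536
terms of the discrete model. The dictionary layout and the helper name
`compression_ratio` are introduced here for illustration only and are not part of the
identification algorithms themselves.

```python
# Term counts per resolution level, copied from the figure captions.
terms = {
    "triangular": [1, 4, 16, 64, 256, 1024],
    "piecewise-linear": [25, 81, 289, 1089],
    "piecewise-quadratic": [25, 81, 289, 1089],
    "piecewise-cubic": [36, 121, 441, 1681],
}

# Sample-by-sample model of a 1-second second-order kernel: 256 x 256 values.
discrete_terms = 256 ** 2


def compression_ratio(n_terms, n_discrete=discrete_terms):
    """Number of discrete-model terms per wavelet coefficient (hypothetical helper)."""
    return n_discrete / n_terms


for basis, counts in terms.items():
    finest = counts[-1]
    print(f"{basis}: {finest} terms at the finest level, "
          f"ratio {compression_ratio(finest):.1f}")
```

Even at the finest levels shown in the figures, each wavelet model uses only a small
fraction of the terms needed by the discrete model.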


7.3  Nonlinear  Oscillator 

In  previous  sections,  the  kernel  identification  algorithms  have  been  validated  for 
a linear oscillator and a prototypical quadratic nonlinear system. It has been
demonstrated that both wavelet implementations are capable of identifying the analytical
Volterra kernels for these simple cases, for which it is known that the system dynamics
can be completely characterized in terms of the first and second-order kernels,
respectively.  In  this  section,  a more  general  case  is  considered  in  which  both  the  first 
and  second-order  kernels  are  needed  to  describe  the  system  dynamics.  In  addition, 
the  higher-order  kernels  are  nonzero  for  this  nonlinear  system.  This  example  lends 
insight  into  the  ability  of  the  wavelet-based  algorithms  to  extract  Volterra  kernels  for 
systems with polynomial nonlinearities.

The system analyzed in this example is a nonlinear oscillator which is described
by  the  following  equation  of  motion: 

$$ m\ddot{y}(t) + c\dot{y}(t) + k_1 y(t) + k_2 y^2(t) = u(t) \qquad (7.20) $$

Equation  (7.20)  takes  the  same  form  as  the  linear  oscillator  except  that  a quadratic 
stiffness  term  has  now  been  introduced.  The  system  parameters  are  chosen  as 


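$$
\begin{aligned}
m &= 1\ \text{kg} \\
c &= 6\ \text{N}\cdot\text{s/m} \\
k_1 &= 4\pi^2\ \text{N/m} \\
k_2 &= 4\pi^2\ \text{N/m}^2
\end{aligned}
\qquad (7.21)
$$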

In  this  case,  the  linear  dynamics  of  the  system  are  not  affected  by  the  nonlinear 
stiffness  term.  Therefore,  the  first-order  kernel  takes  the  same  form  as  in  the  linear 
oscillator  case: 


$$ h_1(t) = \frac{1}{m\omega_d}\, e^{-\zeta\omega_n t} \sin(\omega_d t) \qquad (7.22) $$

For the values chosen in equation (7.21), the underlying linear system has the following
damping ratio, natural frequency, and damped natural frequency:


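$$
\begin{aligned}
\zeta &= 0.478 \\
\omega_n &= 2\pi\ \text{rad/s} = 1\ \text{Hz} \\
\omega_d &= 5.52\ \text{rad/s} = 0.879\ \text{Hz}
\end{aligned}
\qquad (7.23)
$$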
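The modal values above follow directly from the parameters in equation (7.21). As a
minimal sketch, the Python below recomputes them using the standard second-order
relations and evaluates the first-order kernel of equation (7.22). The variable names and
the function `h1` are chosen here for illustration. The computed values agree with
equation (7.23) to within rounding.

```python
import math

# Parameters from equation (7.21).
m = 1.0                   # mass, kg
c = 6.0                   # damping coefficient
k1 = 4.0 * math.pi ** 2   # linear stiffness

omega_n = math.sqrt(k1 / m)                        # natural frequency, rad/s
zeta = c / (2.0 * m * omega_n)                     # damping ratio
omega_d = omega_n * math.sqrt(1.0 - zeta ** 2)     # damped natural frequency, rad/s


def h1(t):
    """First-order kernel of equation (7.22); taken as zero for t < 0 (causal system)."""
    if t < 0.0:
        return 0.0
    return math.exp(-zeta * omega_n * t) * math.sin(omega_d * t) / (m * omega_d)


print(f"zeta    = {zeta:.3f}")
print(f"omega_n = {omega_n:.3f} rad/s = {omega_n / (2.0 * math.pi):.3f} Hz")
print(f"omega_d = {omega_d:.2f} rad/s = {omega_d / (2.0 * math.pi):.3f} Hz")
```

Because the decay rate is $\zeta\omega_n = c/(2m) = 3$ per second, the kernel is negligible
well before the 4-second truncation used later in this section.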

The analytical first-order kernel is depicted in Figure (7.30). The nonlinear dynamics
are characterized in terms of Volterra kernels of second order and higher. It should be
noted that, although the system in equation (7.20) exhibits a quadratic nonlinearity,
that does not imply that the nonlinearity can be completely described by the
second-order kernel. Indeed, as discussed in Section 2.5, it is possible to derive analytical
forms of Volterra kernels from analytic ODEs. It is not difficult to show that the
kernels of third order and higher do not vanish for this system. This naturally raises
the question of whether or not the second-order kernel is adequate to characterize the
nonlinear dynamics. This example investigates under what circumstances an accurate
characterization of the nonlinear output is possible while neglecting the effect of the
kernels of third order and higher.

Figure 7.30: Analytical First-Order Kernel

The  first  and  second-order  kernels  are  extracted  from  input/output  data  from 
the  nonlinear  oscillator  using  the  two  wavelet-based  algorithms.  In  this  example,  only 
the  piecewise-quadratic  multiwavelets  are  used  in  the  multiwavelet  implementation 
since  the  piecewise-linear  and  cubic  multiwavelets  tend  to  give  similar  results.  The 
system  input,  shown  in  Figure  (7.31),  is  chosen  to  be  a linear  chirp  with  a frequency 
range  of  0 to  2 Hz  over  32  seconds  and  an  amplitude  of  2 N.  The  response  of  the 
system to this input is simulated in MATLAB and plotted in Figure (7.32). The
input/output data is sampled at a rate of 128 Hz, corresponding to a resolution level
of  7.  First  and  second-order  kernels  are  extracted  from  the  input/output  data  using 
the  piecewise-quadratic  multiwavelets.  These  kernels  have  been  truncated  after  4 
seconds  and  are  depicted  in  Figures  (7.33)  and  (7.34).  In  Figure  (7.33),  the  identified 
first-order  kernel  at  resolution  level  4 (257  nonzero  terms)  is  denoted  by  a solid  line 
and  the  analytical  first-order  kernel  is  represented  by  a dashed  line.  The  second- 
order  kernel  is  identified  at  resolution  level  0,  corresponding  to  289  nonzero  terms. 
The  plot  of  the  second-order  kernel  indicates  that  it  can  be  further  truncated  to  2 
seconds.  Using  this  information,  second-order  kernels  with  2-second  time  duration 
are  identified.  The  extracted  kernels  at  resolution  levels  1 (289  terms)  and  2 (1089 
nonzero  terms)  are  plotted  in  Figures  (7.35)  and  (7.36)  respectively.  Similarly,  first 
and  second-order  kernels  are  identified  using  the  triangular  wavelet  algorithm.  A 
Haar  wavelet  identification  of  the  first-order  kernel  at  resolution  level  6 (256  terms) 
is  depicted  in  Figure  (7.37)  and  an  extracted  triangular  kernel  at  resolution  level  5 
(1024  terms)  is  shown  in  Figure  (7.38).  The  identified  first-order  kernels  match  the 
analytical kernel relatively well with a small degree of error. The multiwavelet and
triangular  second-order  kernels  are  in  good  agreement  with  each  other.  The  triangular 
kernel  should  have  double  the  amplitude  of  the  symmetric  kernel  over  the  triangular 
domain.  Figures  (7.35)  and  (7.38)  demonstrate  that  this  is  indeed  the  case. 
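The factor of two between the triangular and symmetric kernels can be checked on a small
discrete example. The sketch below uses arbitrary toy values for the kernel `H` and the
input `u`, not data from this study, and the function names are hypothetical. It folds a
symmetric kernel onto the triangle $j \le i$ by doubling the off-diagonal entries and
confirms that both forms give the same second-order output sample.

```python
def symmetric_output(H, u):
    """Second-order output sample using the full symmetric kernel H[i][j]."""
    n = len(u)
    return sum(H[i][j] * u[i] * u[j] for i in range(n) for j in range(n))


def to_triangular(H):
    """Fold a symmetric kernel onto the triangle j <= i.

    Off-diagonal entries are doubled, diagonal entries are kept, and entries
    above the diagonal are set to zero.
    """
    n = len(H)
    return [[(H[i][j] if i == j else 2.0 * H[i][j]) if j <= i else 0.0
             for j in range(n)] for i in range(n)]


def triangular_output(T, u):
    """Second-order output sample using only the triangle j <= i."""
    n = len(u)
    return sum(T[i][j] * u[i] * u[j] for i in range(n) for j in range(i + 1))


# Toy symmetric kernel and input (illustrative values only).
H = [[-1.0, 0.5, 0.2],
     [0.5, -0.8, 0.1],
     [0.2, 0.1, -0.3]]
u = [1.0, -2.0, 0.5]
T = to_triangular(H)

print(symmetric_output(H, u), triangular_output(T, u))
```

In the discrete case the diagonal is not doubled. In continuous time the diagonal has zero
area, which is why the triangular kernel is twice the symmetric kernel over the
triangular domain.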

Figure 7.31: Chirp Input (0-2 Hz)

Figure 7.32: Simulated Response to Chirp Input

Figure 7.33: Level 4 (257 Term) Identification of the First-Order Kernel Using
Piecewise-Quadratic Multiwavelets

Figure 7.34: Level 0 (289 Term) Identification of the Second-Order Kernel Using
Piecewise-Quadratic Multiwavelets

Figure 7.35: Level 1 (289 Term) Identification of the Second-Order Kernel Using
Piecewise-Quadratic Multiwavelets

Figure 7.36: Level 2 (1089 Term) Identification of the Second-Order Kernel Using
Piecewise-Quadratic Multiwavelets

Figure 7.37: Level 6 (256 Term) Identification of the First-Order Kernel Using Haar
Wavelets

Figure 7.38: Level 5 (1024 Term) Identification of the Second-Order Kernel Using
Triangular Wavelets

Figure 7.39: Total Output Predicted By the Identified Multiwavelet Kernels

The total output predicted by the identified multiwavelet kernels is plotted in
Figure (7.39). Since there is not much difference between the second-order kernel
identified at resolution level 1 (289 nonzero terms) in Figure (7.35) and the one
identified at resolution level 2 (1089 nonzero terms) in Figure (7.36), the lower-resolution
kernel is used to predict the second-order output. The predicted first-order (linear)
and second-order (nonlinear) outputs are shown in Figures (7.40) and (7.41)
respectively. Since the underlying linear system is known, it is possible to simulate the
linear output for comparison with the output predicted by the identified first-order
kernel. The nonlinear part of the output is simply the difference between the total
simulated output and the simulated linear output. Then, it is possible to evaluate
how well the predicted second-order output approximates the nonlinear portion of the
simulated output. As can be seen in Figure (7.41), the nonlinear part of the output
for this case is at least an order of magnitude smaller than the linear output. It should
be noted that the nonlinear dynamics are negative. Therefore, the structure of the
identified second-order kernels is physically reasonable since the dominant feature of
the kernels is a large negative hump. The total output predicted by the first and
second-order kernels extracted using the triangular wavelet implementation is shown
in Figure (7.42). The predicted first-order and second-order outputs are displayed
in Figures (7.43) and (7.44) respectively. As usual, in all the figures, the solid line
denotes the identified output while the dashed line represents the simulated output.
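The split of the simulated output into linear and nonlinear parts can be reproduced with a
short simulation. The sketch below is written in Python rather than MATLAB and uses a
classical fourth-order Runge-Kutta integrator, which is an assumption; the integrator used
in the original study is not stated. The chirp is assumed to start with zero phase, and the
names `chirp` and `simulate` are illustrative.

```python
import math

# Parameters from equation (7.21).
m, c = 1.0, 6.0
k1 = k2 = 4.0 * math.pi ** 2


def chirp(t, amp=2.0, f0=0.0, f1=2.0, T=32.0):
    """Linear chirp sweeping f0 -> f1 Hz over T seconds (zero initial phase assumed)."""
    return amp * math.sin(2.0 * math.pi * (f0 * t + 0.5 * (f1 - f0) * t * t / T))


def simulate(u, k2_val, T=32.0, fs=128.0):
    """Integrate m*y'' + c*y' + k1*y + k2_val*y**2 = u(t) from rest with RK4."""
    def rhs(t, y, v):
        return v, (u(t) - c * v - k1 * y - k2_val * y * y) / m

    dt = 1.0 / fs
    y, v, out = 0.0, 0.0, []
    for n in range(int(T * fs)):
        t = n * dt
        out.append(y)
        a1, b1 = rhs(t, y, v)
        a2, b2 = rhs(t + dt / 2, y + dt / 2 * a1, v + dt / 2 * b1)
        a3, b3 = rhs(t + dt / 2, y + dt / 2 * a2, v + dt / 2 * b2)
        a4, b4 = rhs(t + dt, y + dt * a3, v + dt * b3)
        y += dt / 6 * (a1 + 2 * a2 + 2 * a3 + a4)
        v += dt / 6 * (b1 + 2 * b2 + 2 * b3 + b4)
    return out


total = simulate(chirp, k2)     # full nonlinear oscillator
linear = simulate(chirp, 0.0)   # underlying linear system (k2 set to zero)
nonlinear = [yt - yl for yt, yl in zip(total, linear)]

# Ratio of peak linear output to peak nonlinear output.
ratio = max(abs(x) for x in linear) / max(abs(x) for x in nonlinear)
print(f"peak linear / peak nonlinear = {ratio:.1f}")
```

The difference `nonlinear` contains the contribution of every kernel of second order and
higher, so it is the quantity that the identified second-order kernel can only approximate.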

Figure 7.40: First-Order (Linear) Output Predicted By the Identified First-Order
Multiwavelet Kernel

Figure 7.41: Second-Order (Nonlinear) Output Predicted By the Identified Second-Order
Multiwavelet Kernel

Figure 7.42: Total Output Predicted By the Identified Haar and Triangular Wavelet
Kernels

Figure 7.43: First-Order (Linear) Output Predicted By the Identified First-Order
Haar Wavelet Kernel

Figure 7.44: Second-Order (Nonlinear) Output Predicted By the Identified Second-Order
Triangular Kernel

In both wavelet-based kernel identification algorithms, the ability of the extracted
kernels to accurately predict the output is not always a guarantee that the kernels
are accurate. Because the kernel coefficients are trained to faithfully reproduce the
output, there is always the danger that the identified kernels correspond to a mere
curve fit of the specific data set from which they are extracted. In the examples
considered thus far, analytical kernels have been available to serve as a check on
the accuracy of the identified kernels. In practice, however, one does not usually
have analytical forms of the Volterra kernels. Instead, the kernels can be validated
by evaluating their ability to predict the outputs corresponding to different input
data sets. In this example, although the analytical first-order kernel is known, and
it is possible to derive the analytical second-order kernel from the ODE in equation
(7.20), the identified kernels are tested on a number of different input/output data
sets. This analysis is performed for the piecewise-quadratic multiwavelet kernels
depicted in Figures (7.33) and (7.35). First, an input composed of two sinusoids with
frequencies of .5 Hz and 4 Hz is considered. This input, depicted in Figure (7.45),
takes the form


$$ u_1(t) = A_1 \sin(\pi t) + A_2 \sin(8\pi t) \qquad (7.24) $$

where the amplitudes are selected as $A_1 = 1$ N and $A_2 = .25$ N. Although the total
duration of the input data set is 32 seconds, only the first 16 seconds are plotted
for clarity. The predicted first-order and second-order outputs from the identified
kernels are compared with the linear and nonlinear portions of the simulated output
in Figures (7.46) and (7.47). Once again, only the first 16 seconds of output are shown.
Since the nonlinear part of the output is very small compared to the linear part, the
total output differs little from the linear output and is not plotted in this case. The
results show that the first-order kernel accurately predicts the linear dynamics. This
is not surprising since the identified kernel matches the analytical kernel relatively
well. The identified second-order kernel is able to predict the nonlinear dynamics
with relatively small error.

Figure 7.45: Input $u_1$ for Kernel Validation
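The first-order prediction for this validation input is a convolution of the first-order
kernel with the input. As a minimal sketch, the Python below uses the analytical kernel of
equation (7.22) in place of an identified one, since the identified coefficients are not
reproduced here, and approximates the convolution integral with a rectangle rule at the
128 Hz sampling rate. The function names are illustrative.

```python
import math

# Parameters from equation (7.21) and the resulting modal values.
m, c, k1 = 1.0, 6.0, 4.0 * math.pi ** 2
omega_n = math.sqrt(k1 / m)
zeta = c / (2.0 * m * omega_n)
omega_d = omega_n * math.sqrt(1.0 - zeta ** 2)

fs = 128.0        # sampling rate used in this section, Hz
dt = 1.0 / fs


def h1(t):
    """Analytical first-order kernel, equation (7.22)."""
    return math.exp(-zeta * omega_n * t) * math.sin(omega_d * t) / (m * omega_d)


def u1(t, a1=1.0, a2=0.25):
    """Validation input of equation (7.24): 0.5 Hz and 4 Hz sinusoids."""
    return a1 * math.sin(math.pi * t) + a2 * math.sin(8.0 * math.pi * t)


kernel = [h1(k * dt) for k in range(int(4 * fs))]   # kernel truncated after 4 s
u = [u1(n * dt) for n in range(int(16 * fs))]       # first 16 s of the input


def first_order_output(kernel, u, dt):
    """Rectangle-rule approximation of y1(t) = integral of h1(tau) * u(t - tau) dtau."""
    y = []
    for n in range(len(u)):
        acc = 0.0
        for k in range(min(n + 1, len(kernel))):
            acc += kernel[k] * u[n - k]
        y.append(acc * dt)
    return y


y1 = first_order_output(kernel, u, dt)
print(f"peak first-order output over 16 s: {max(abs(v) for v in y1):.4f}")
```

Once the start-up transient has decayed, `y1` should match the steady-state response of
the underlying linear system to the two sinusoids, which gives a simple check on any
identified first-order kernel.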

Figure 7.46: Predicted First-Order (Linear) Output for Input $u_1$

Figure 7.47: Predicted Second-Order (Nonlinear) Output for Input $u_1$

Next, consider an input of the form

$$ u_2(t) = A_1 \sin(\pi t/16) + A_2 \sin(\pi t) + A_3 \sin(2\pi t) + A_4 \sin(4\pi t) + A_5 \sin(16\pi t) \qquad (7.25) $$

which is composed of sinusoids with frequencies of {1/32, .5, 1, 2, 8} Hz. The
amplitudes {$A_1$, $A_2$, $A_3$, $A_4$, $A_5$} are chosen as {2, 1, .5, .25, 1} N. This input
is displayed in Figure (7.48). The predicted first and second-order outputs are shown in
Figures (7.49) and (7.50). Once again, the identified first-order kernel predicts the linear
dynamics with a high degree of accuracy. The identified second-order kernel captures
the nonlinear dynamics relatively well. The difference between the predicted
and actual nonlinear dynamics can be attributed in part to error in the identified
kernel and, more significantly, the truncation error resulting from neglecting
higher-order kernels. The total output predicted by the identified kernels is compared to the
simulated output in Figure (7.51).

Figure 7.48: Input $u_2$ for Kernel Validation

Figure 7.49: Predicted First-Order (Linear) Output for Input $u_2$

Figure 7.50: Predicted Second-Order (Nonlinear) Output for Input $u_2$

Figure 7.51: Total Predicted Output for Input $u_2$

Next, the amplitude $A_1$ in the input signal $u_1$ is changed from 1 to 5. The
resulting input $u_3$ is depicted in Figure (7.52). The increased amplitude of the input
has the effect of increasing the influence of the nonlinearity on the overall output.
The predicted first and second-order outputs are shown in Figures (7.53) and (7.54)
respectively. The nonlinear part of the output is now a factor of 10 smaller than
the linear part. In contrast, for the case where $A_1 = 1$, that ratio is greater than 30.
Thus, in this case, the nonlinearity exerts a greater influence than before. From Figure
(7.53), the first-order kernel accurately predicts the linear dynamics. The predicted
second-order output, however, shows significant amplitude error in capturing the
nonlinear oscillations. This is due to the fact that the increase in the amplitude of
the input has excited the higher-order nonlinear dynamics that have been neglected
in the model. Nevertheless, the total output predicted by the identified kernels is still
very accurate as can be seen in Figure (7.55).

Figure 7.52: Input $u_3$ for Kernel Validation

Figure 7.53: Predicted First-Order (Linear) Output for Input $u_3$

Figure 7.54: Predicted Second-Order (Nonlinear) Output for Input $u_3$

Figure 7.55: Total Predicted Output for Input $u_3$

As a final example, the effect of increasing the input amplitude is investigated
further by changing the amplitude $A_1$ in the input signal $u_2$ from 2 to 10. The resulting
input $u_4$ is shown in Figure (7.56). The predicted first and second-order outputs are
plotted in Figures (7.57) and (7.58) respectively. The nonlinear dynamics are now
of the same order of magnitude as the linear dynamics. As can be seen in Figure
(7.58), the identified second-order kernel is not capable of accurately predicting the
nonlinear dynamics. This can be attributed to the fact that as the input amplitude
is increased, the higher-order Volterra operators have an increased influence on the
nonlinear output. The truncation error due to neglecting the higher-order kernels is
magnified and the second-order kernel alone is no longer capable of characterizing
the nonlinear dynamics. These validation cases show that the nonlinear dynamics of
the system are well characterized by the identified second-order kernel for a range of
input amplitudes. As the input amplitude grows larger, however, higher-order kernels
have a greater influence and can no longer be neglected. These results are consistent
with the convergence discussion given in Section 2.5, where it was noted that the
convergence of a truncated Volterra series is guaranteed only for a bounded range of
input amplitudes.

Figure 7.56: Input $u_4$ for Kernel Validation
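The amplitude dependence observed in these validation cases is consistent with the
homogeneity of the Volterra operators. As a brief worked statement, write $y_n[u]$ for the
output of the $n$th-order operator with kernel $h_n$ (notation assumed here for
illustration, with the integration limits left generic). Scaling the whole input by a
factor $\alpha$ then gives:

```latex
y_n[\alpha u](t)
  = \int \!\cdots\! \int h_n(\tau_1, \ldots, \tau_n)
    \prod_{i=1}^{n} \alpha\, u(t - \tau_i)\, d\tau_1 \cdots d\tau_n
  = \alpha^n\, y_n[u](t),
\qquad
\frac{\lVert y_2[\alpha u] \rVert}{\lVert y_1[\alpha u] \rVert}
  = |\alpha|\, \frac{\lVert y_2[u] \rVert}{\lVert y_1[u] \rVert},
\qquad
\frac{\lVert y_3[\alpha u] \rVert}{\lVert y_2[\alpha u] \rVert}
  = |\alpha|\, \frac{\lVert y_3[u] \rVert}{\lVert y_2[u] \rVert}.
```

Each higher-order contribution therefore grows by one more power of the amplitude than
the one below it. In the cases above only one sinusoidal component is scaled rather than
the whole input, so these relations apply only approximately, but they indicate why the
neglected kernels of third order and higher matter more as the amplitude increases.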


Figure 7.57: Predicted First-Order (Linear) Output for Input $u_4$

Figure 7.58: Predicted Second-Order (Nonlinear) Output for Input $u_4$

Figure 7.59: Total Predicted Output for Input $u_4$


7.4 Flutter Analysis

As a final example, Volterra kernels are identified from flight test data from the
Aerostructures Test Wing (ATW), which was developed at NASA Dryden to study
flutter. Flutter is a linear aeroelastic instability that results from the unfavorable
interaction of aerodynamic forces with the wing structure. In classical binary flutter,
two modes of vibration, often the bending and torsion modes, become coupled as
the air speed approaches the flutter speed. This coupling causes the damping ratio
of one of the modes to increase while the damping ratio corresponding to the other
mode approaches zero. At the critical flutter speed, this damping ratio reaches zero
and any aerodynamic disturbance results in an unstable oscillation in the undamped
mode. Thus, flutter is usually catastrophic, and the determination of flutter speeds
is a critical aspect of the design of any aircraft. One common approach to flutter
prediction is to extrapolate the damping ratios from flight data at different flight
conditions to determine the critical flutter speed [117]. Another approach is to employ
the concept of a flutter margin to predict the flutter speed [118]. These methods have
their limitations, however, and do not always lead to conservative flutter estimates
[119]. Recently, Lind and Brenner [6] developed the flutterometer, a flutter prediction
tool that computes a flutter speed based on an analytical model. The computed flutter
speed is robust with respect to an uncertainty description that is generated from flight
data. In this manner, a worst-case flutter speed is predicted that is conservative with
respect to the true flutter speed.

The ATW was designed at the NASA Dryden Flight Research Center in order
to test the flutterometer. The ATW, pictured in Figure (7.60), consisted of a wing
and boom assembly that was attached to the undercarriage of an F-15 aircraft as
shown in Figure (7.61). The wing span of the ATW was 18 in. and the total weight
was 2.66 lbs. The ATW was designed to flutter at a speed that was within the safe
flight envelope of the host F-15. By actually driving the ATW to flutter, the true
flutter speed of the ATW was experimentally determined for comparison with the
flutterometer predictions. The wing was constructed from lightweight materials in
order to guarantee that it would not damage the F-15 once it broke apart during
flutter. The ATW was instrumented with 18 strain gages and 3 accelerometers.
Piezoelectric patches were mounted on the upper and lower surfaces of the wing.
Operating in tandem, the patches served as a single actuator that provided a linear chirp
excitation from 5 to 35 Hz. Flight tests of the ATW were performed at NASA Dryden
in the spring of 2001. The interested reader can refer to Ref. [119] for more details
regarding the flight-testing of the ATW.

Figure 7.60: Aerostructures Test Wing

Figure 7.61: Mounting of the Aerostructures Test Wing

Flight data from the ATW was taken at a number of different flight conditions.
In this example, Volterra kernels are extracted from data corresponding to a flight
condition of Mach number .8 and an altitude of 20,000 ft. The input is taken as the
chirp excitation over a frequency range of 5 to 35 Hz. The output is taken as the
response measured by an accelerometer that was mounted on the trailing edge of
the boom. The measured output is shown in Figure (7.62). The dominant modes of
the system are an 18 Hz bending mode and a 24 Hz torsion mode. The response of
these modes can be seen in Figure (7.62), where an increased output response
is evident at approximately 25 seconds and 37 seconds. These times correspond to
when the input excitation frequency reaches 18 and 24 Hz, respectively. The data
is sampled at a rate of 800 Hz. A zero-order hold approximation of the input and
output at level 8, corresponding to a sampling frequency of 256 Hz, is generated
by projecting the data onto Haar scaling functions on level 8. First and second-order
Volterra kernels are then identified from the input/output data. Varying the
kernel length in the identification, it has been determined that a time duration of
1 second is sufficient for the first-order kernel. A Haar identification of the first-order
kernel at a resolution level of 8 (256 nonzero terms) is shown in Figure (7.63).
Similarly, a piecewise-quadratic multiwavelet identification of the kernel at resolution
level 6 (257 nonzero terms) is shown in Figure (7.64). The two identified kernels
are very similar in form. Viewed in the frequency domain, both extracted kernels
show the dominant modes at 18 Hz and 24 Hz, as shown in the frequency domain
plots of the kernels in Figures (7.65) and (7.66). Unfortunately, attempts to extract a
physically meaningful second-order kernel have proven unsuccessful in both
wavelet-based algorithms. The identified second-order kernels show no coherent structure and
fail to converge as the number of nonzero wavelet coefficients is increased. Thus, it
is suspected that the second-order kernels are modeling noise in the output. The fact
that the identified second-order kernels are much smaller in magnitude than the
first-order kernels, coupled with the fact that the identified first-order kernels resemble
impulse response functions, implies that the ATW was essentially a linear system.
This is in agreement with a previous analysis of the ATW that demonstrated that it
responded linearly to a range of inputs [120]. Because the two identified first-order
kernels are so similar, the predicted linear responses are essentially the same and
it is not necessary to show both plots. The linear response predicted by the Haar
identification of the first-order kernel is shown in Figure (7.67). Qualitatively, the
predicted output appears to capture the response of the two dominant modes at 25
seconds and 37 seconds. The predicted linear output does not identify much of the
noise that is present in the original flight data, particularly at the beginning of the
data set where the response is almost entirely due to noise.
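The projection of the 800 Hz data onto level-8 Haar scaling functions can be sketched as
an averaging operation. The Python below is one possible reading of that step, not the
implementation used in this study. It assumes that each recorded sample holds over its own
sampling interval, so the projection reduces to the mean of the signal over each cell of
width $2^{-8}$ seconds. The function name `haar_project` and the ramp test record are
hypothetical.

```python
import math


def haar_project(samples, fs_in, level):
    """Piecewise-constant (Haar scaling function) approximation at a given level.

    Assumption: sample i holds over [i/fs_in, (i+1)/fs_in). The returned values
    are the means of that signal over consecutive cells of width 2**-level s.
    """
    cells_per_second = 2 ** level
    duration = len(samples) / fs_in
    n_cells = int(duration * cells_per_second)
    out = []
    for k in range(n_cells):
        a = k / cells_per_second          # cell start, s
        b = (k + 1) / cells_per_second    # cell end, s
        first = int(math.floor(a * fs_in))
        last = min(int(math.ceil(b * fs_in)), len(samples))
        acc = 0.0
        for i in range(first, last):
            lo = max(a, i / fs_in)
            hi = min(b, (i + 1) / fs_in)
            if hi > lo:
                acc += samples[i] * (hi - lo)   # overlap-weighted sum
        out.append(acc * cells_per_second)      # divide by the cell width
    return out


# One second of a placeholder 800 Hz record (a ramp, not flight data).
fs_flight = 800.0
record = [i / fs_flight for i in range(800)]
coarse = haar_project(record, fs_flight, level=8)
print(len(coarse))
```

Because 800 is not a multiple of 256, each cell overlaps three or four input samples, which
is why the overlap lengths are used as weights instead of a plain block average.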

A practical  application  of  the  identified  first-order  kernels  is  described  in  Ref.  [33] . 
In  this  paper,  the  kernels  are  incorporated  into  the  fiutterometer  in  order  to  estimate 
more  accurate  flutter  speeds  for  the  ATW.  The  original  fiutterometer  implementation 
computes  flutter  speeds  that  are  robust  with  respect  to  an  uncertainty  description 
that  is  generated  from  flight  data.  Basically,  a transfer  function  is  computed  from 
the  nominal  finite  element  model.  An  uncertainty  envelope  is  then  placed  around 
the  nominal  transfer  function  in  such  a manner  that  the  transfer  function  derived 
from  the  flight  data  is  covered.  This  is  demonstrated  in  Figure  (7.68).  The  un- 
certainty envelope  ensures  a conservative  flutter  estimate,  but  the  computed  flutter 
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Figure  7.62:  Accelerometer  Response  to  a Chirp  Excitation 


FIRST-ORDER  KERNEL 


Figure  7.63:  Haar  Identification  of  the  First-Order  Kernel  at  Resolution  Level  8 (256 
terms) 
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FIRST-ORDER  KERNEL 


Figure  7.64:  Piecewise- Quadratic  Multiwavelet  Identification  of  the  First-Order  Ker- 
nel at  Resolution  Level  6 (257  terms) 


FFT  OF  FIRST-ORDER  KERNEL 


Figure  7.65:  FFT  of  the  Haar  Identification  of  the  First-Order  Kernel 
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FFT  OF  FIRST-ORDER  KERNEL 


Figure  7.66:  FFT  of  the  Piecewise-Quadratic  Multiwavelet  Identification  of  the  First- 
Order  Kernel 


Figure  7.67:  Linear  Response  Predicted  by  the  Identified  First-Order  Kernel 
speed is overly conservative. One reason is that the current flutterometer implementation
uses the flight data to generate the uncertainty description. Flutter is a linear
phenomenon and is modeled by an analytical linear model. The flight data, however, consists
of linear dynamics, nonlinear dynamics, and noise. Therefore, an uncertainty description
based on the differences between the flight data and the nominal model incorporates
nonlinear dynamics and noise, which do not contribute to the linear flutter dynamics. Using
the first-order kernel that has been identified from the flight data, it is possible to
extract the linear dynamics. An uncertainty description can then be obtained in terms of
the differences between the analytical model and the identified linear dynamics. This leads
to a more accurate characterization of the uncertainty in the linear flutter model, as
demonstrated in Figure (7.69), in which the transfer function of the linear dynamics
predicted from the Haar identification of the first-order kernel is used to define the
uncertainty envelope. Comparing Figures (7.68) and (7.69), the uncertainty envelope based
on the extracted linear dynamics is smaller than that obtained from the flight data. This
implies that the flutterometer will predict a less conservative flutter speed using the new
implementation. This is indeed the case: the flutterometer predicts flutter speeds of 403
knots of equivalent air speed (KEAS) for the uncertainty description based on the flight
data and 413 KEAS for the uncertainty description obtained from the first-order Volterra
kernel. By design, the flutterometer always estimates a more conservative flutter speed
than the nominal model, which in this case predicts a flutter speed of 431 KEAS. The true
flutter speed of the ATW, determined in flight test when the ATW was driven to flutter and
destroyed, is 460 KEAS. Thus, the first-order Volterra kernel identification can be used to
improve the accuracy of the flutterometer and, in many cases, reduce some of the inherent
conservatism in the resulting flutter estimates.
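
The four flutter speeds quoted above can be placed side by side to show how much conservatism each estimate carries relative to the flight-test value. The following Python sketch is only illustrative arithmetic on the numbers given in the text; the dictionary labels and the `margin` helper are names introduced here and are not part of the flutterometer.

```python
# Flutter speeds in KEAS, taken directly from the text above.
TRUE_FLUTTER_KEAS = 460.0
predictions = {
    "flutterometer, uncertainty from flight data": 403.0,
    "flutterometer, uncertainty from first-order kernel": 413.0,
    "nominal model": 431.0,
}

def margin(predicted, true_speed=TRUE_FLUTTER_KEAS):
    """Return (shortfall in KEAS, shortfall as a percentage of the true speed)."""
    shortfall = true_speed - predicted
    return shortfall, 100.0 * shortfall / true_speed

margins = {name: margin(speed) for name, speed in predictions.items()}

for name, (knots, percent) in margins.items():
    print(f"{name}: {knots:.0f} KEAS below flight test ({percent:.1f}%)")
```

On these numbers, the kernel-based uncertainty description recovers 10 KEAS of the 57 KEAS shortfall of the data-based description, while remaining more conservative than the nominal model.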
Figure 7.68: Uncertainty Description Based on the Flight Data: flight data ( ),
nominal  model  (•••),  uncertainty  envelope  ( — ) 


Figure 7.69: Uncertainty Description Based on the First-Order Volterra Kernel:
identified linear dynamics ( ), nominal model (•••), uncertainty envelope ( — )


CHAPTER  8 
CONCLUSION 


In this dissertation, two novel wavelet-based algorithms have been developed for the
identification of Volterra kernels for nonlinear dynamical systems. In both cases, the
Volterra series model is truncated to include only the first- and second-order Volterra
operators. The first implementation employs Haar wavelets for the representation of the
first-order kernel and triangular wavelets for the approximation of the triangular form of
the second-order kernel. These triangular wavelets have been constructed using the method
developed by Micchelli and Xu [4, 5] for deriving wavelets over finite sets. The second
wavelet implementation uses piecewise-polynomial multiwavelets to approximate the
first- and second-order kernels. These multiwavelets have been generated from classical
finite element basis functions using the technique of successive intertwinings of
multiresolution analyses, developed by Donovan, Geronimo, and Hardin [2, 3]. This method
has been demonstrated for the construction of piecewise-quadratic and piecewise-cubic
multiwavelets. In addition, it has been shown that this approach can be generalized to
construct multiwavelets of arbitrary order from the appropriate finite elements, and a
theorem has been given to that effect. These piecewise-polynomial multiwavelets are easily
adapted to the finite domains over which the kernels are supported. In this implementation,
tensor product multiwavelets are used to represent the symmetric form of the second-order
kernel.
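
The idea of representing a first-order kernel with a small number of Haar coefficients can be illustrated with a minimal Python sketch. It assumes a hypothetical damped-sinusoid kernel, of the kind a linear oscillator would produce, sampled at 64 points; all function names are introduced here for illustration. It compresses a kernel that is already known, whereas the identification algorithms estimate the coefficients from input/output data, so it should not be read as the implementation used in this work.

```python
import math

def haar_forward(signal):
    """Orthonormal Haar transform of a list whose length is a power of two."""
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        averages = [(coeffs[2 * i] + coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        details = [(coeffs[2 * i] - coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        coeffs[:n] = averages + details  # coarse part first, details after it
        n = half
    return coeffs

def haar_inverse(coeffs):
    """Invert haar_forward, rebuilding the signal from coarse to fine."""
    signal = list(coeffs)
    n = 1
    while n < len(signal):
        averages, details = signal[:n], signal[n:2 * n]
        merged = []
        for a, d in zip(averages, details):
            merged += [(a + d) / math.sqrt(2), (a - d) / math.sqrt(2)]
        signal[:2 * n] = merged
        n *= 2
    return signal

def keep_largest(coeffs, count):
    """Zero all but the `count` largest-magnitude coefficients."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(order[:count])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]

def l2_error(a, b):
    """Euclidean distance between two equally long lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical kernel: impulse response of a lightly damped oscillator, 64 samples.
dt = 1.0 / 16.0
kernel = [math.exp(-0.5 * k * dt) * math.sin(2.0 * math.pi * k * dt) for k in range(64)]

coeffs = haar_forward(kernel)
errors = {count: l2_error(kernel, haar_inverse(keep_largest(coeffs, count)))
          for count in (8, 16, 32, 64)}
```

Because the transform is orthonormal, the reconstruction error equals the norm of the discarded coefficients, so it cannot increase as more coefficients are retained and vanishes (to rounding) when all 64 are kept.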

These wavelet-based algorithms have been implemented to extract Volterra kernels from
input/output data from a number of dynamical systems. The results show that both wavelet
approaches successfully identified first-order kernels for the linear oscillator case and
second-order kernels for the prototypical quadratic nonlinear system in Section 7.2. In
these examples, relatively low-order estimates of the kernels were obtained. The identified
first-order kernels in the linear oscillator case converged exactly to the analytical
kernel. In the quadratic example, however, the identified kernels converged but showed some
instability as the number of coefficients in the model was increased. This is most likely
due to the ill-posed nature of the second-order identification, which was reflected in the
singular value plots: these typically showed a large number of singular values that were
essentially zero.
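
The connection between near-zero singular values and an unstable solution can be illustrated with a generic truncated-SVD least-squares solve. The sketch below uses NumPy and is an assumption-laden toy problem, not the MATLAB implementation described in this dissertation: the regression matrix `A` (with two almost identical columns), the perturbation, and the function name `truncated_svd_solve` are all hypothetical.

```python
import numpy as np

def truncated_svd_solve(A, y, rank):
    """Least-squares solution of A x = y using only the `rank` largest singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Project the data onto the retained left singular vectors and rescale.
    projected = (U[:, :rank].T @ y) / s[:rank]
    return Vt[:rank].T @ projected

# Hypothetical ill-conditioned regression matrix: columns 2 and 3 are almost identical.
t = np.linspace(0.0, 1.0, 20)
g = np.sin(7.0 * t)
A = np.column_stack([np.ones_like(t), t, t + 1e-8 * g])
x_true = np.array([1.0, 2.0, 0.0])

# Small deterministic perturbation standing in for noise; it is chosen to lie
# along the weakly observed direction so that its amplification is visible.
y = A @ x_true + 1e-4 * g

singular_values = np.linalg.svd(A, compute_uv=False)
x_full = truncated_svd_solve(A, y, rank=3)       # keeps the near-zero singular value
x_truncated = truncated_svd_solve(A, y, rank=2)  # discards it
```

In this toy problem, retaining the smallest singular value amplifies a perturbation of order 1e-4 into coefficients of order 1e4, while the truncated solution stays close to [1, 1, 1]. The truncated solution does not recover `x_true` either: it returns the minimum-norm split between the two indistinguishable columns, which is one way to see why the choice of truncation point matters.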

The nonlinear oscillator example demonstrated the ability of the wavelet algorithms to
simultaneously identify first- and second-order kernels for a system with a polynomial
nonlinearity. The extracted kernels were not completely smooth, but clearly converged to a
specific form. The identified first-order kernels matched the analytical first-order kernel
of the underlying linear system relatively well. The identified multiwavelet and triangular
second-order kernels were in agreement, lending confidence to the results. When identifying
Volterra kernels from input/output data, there is always a danger that the identified
kernels correspond to a mere curve fit of the data rather than the true structure of the
system. Therefore, the identified kernels were validated by testing their ability to
predict the outputs corresponding to different inputs. The kernels successfully predicted
these outputs for a range of input amplitudes. However, as the input amplitude was
increased, the identified second-order kernel was no longer sufficient to characterize the
nonlinear dynamics, because increasing the input amplitude resulted in greater
contributions from the unmodeled, higher-order kernels. It is therefore important to note
that, in many cases, a model based on a truncated Volterra series may be valid only for a
limited range of input amplitudes.
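
The amplitude dependence described above follows from the way each Volterra operator scales with the input. A minimal Python sketch of a discrete Volterra series truncated at second order is given below; the kernels `h1` and `h2`, the input `u`, and the function name are hypothetical values chosen for illustration, and the time-step factors of the continuous integrals are omitted.

```python
def volterra_response(h1, h2, u):
    """Return (first-order part, second-order part) of a truncated discrete Volterra series."""
    n_samples, memory = len(u), len(h1)
    y1, y2 = [], []
    for n in range(n_samples):
        # Past inputs u[n], u[n-1], ..., with zeros before the record starts.
        past = [u[n - k] if n - k >= 0 else 0.0 for k in range(memory)]
        y1.append(sum(h1[k] * past[k] for k in range(memory)))
        y2.append(sum(h2[k1][k2] * past[k1] * past[k2]
                      for k1 in range(memory) for k2 in range(memory)))
    return y1, y2

# Hypothetical kernels with a memory of four samples (h2 is symmetric).
memory = 4
h1 = [0.5 ** k for k in range(memory)]
h2 = [[0.1 * 0.5 ** (k1 + k2) for k2 in range(memory)] for k1 in range(memory)]
u = [1.0, 0.0, -1.0, 0.5, 0.25, 0.0]

y1, y2 = volterra_response(h1, h2, u)
y1_big, y2_big = volterra_response(h1, h2, [2.0 * x for x in u])
```

Doubling the input doubles the first-order part but quadruples the second-order part. A third-order term, if present in the true system, would grow eightfold, which is consistent with the unmodeled higher-order kernels becoming significant as the input amplitude is increased.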

Finally, Volterra kernels were extracted from flight test data from the Aerostructures
Test Wing (ATW). It was demonstrated that first-order kernels can be identified from noisy
flight data. The identified kernels successfully captured the two dominant modes of the
system. Attempts to identify a convergent second-order kernel were unsuccessful, however.
This could be explained by the fact that the ATW is basically a linear system. The
second-order kernel can therefore be expected to be very small, and the high level of noise
in the data may have prevented the algorithms from extracting it. An application of the
identified first-order kernels to flutter prediction has also been given.

The wavelet-based algorithms have several limitations. First, the current implementations
in MATLAB limit the number of coefficients that can be included in the model, which in turn
limits the resolution of the identified kernels. In most of the examples, this was not a
problem, as it was demonstrated that the kernels could be represented in terms of a
relatively small number of coefficients. In some cases, such as the quadratic nonlinear
system discussed in Section 7.2, increasing the number of coefficients can actually make it
more difficult to obtain a stable solution. The limit on the number of coefficients is a
serious liability, however, if the system has a long memory. In that case, the
corresponding kernels have longer duration, so the finest possible resolution level may be
very coarse. Therefore, in the current implementations, there are practical limits on the
lengths of the identified kernels. In general, for a sampling rate of 128 Hz for the input
and output, the algorithms have difficulty dealing with kernel lengths that exceed about 32
seconds in the first-order case and approximately 8 seconds in the second-order case.
Second, both algorithms require a significant amount of computational time, and in all
cases the computational time increases with the length of the identified kernels. The
multiwavelet implementation is much faster than the triangular implementation. For the
examples in this dissertation, in which the kernels were of relatively short time duration,
the second-order part of the multiwavelet algorithm took on the order of 5 to 10 minutes to
run, while the triangular algorithm required on the order of an hour. Another limitation of
the wavelet algorithms is that it is extremely difficult to extract the kernels in some
cases. In the nonlinear oscillator case, the identified second-order kernels were extremely
sensitive to the point at which the singular values were truncated in generating the
solution. Extracting the kernels was therefore a tedious procedure, requiring a significant
amount of trial and error in truncating the singular values. In the ATW case, convergent
second-order kernels were not found, presumably because the second-order kernel was very
small for that system and noise precluded its identification.
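
The kernel-length limits quoted above can be translated into sample counts with simple arithmetic. The Python sketch below assumes, purely for illustration, that every sample of the kernel on the finest grid is treated as an unknown; this is an upper bound, since the wavelet representations are intended to retain far fewer coefficients. The function name and dictionary keys are introduced here.

```python
def kernel_sizes(sample_rate_hz, duration_s):
    """Sample counts for kernels of a given duration if every sample is an unknown."""
    n = int(round(sample_rate_hz * duration_s))
    return {
        "first_order": n,                             # h1 sampled on n points
        "second_order_full": n * n,                   # h2 on the full n-by-n square
        "second_order_triangular": n * (n + 1) // 2,  # h2 on the triangle k1 >= k2
    }

first = kernel_sizes(128, 32)   # first-order limit quoted in the text
second = kernel_sizes(128, 8)   # second-order limit quoted in the text
```

At 128 Hz, a 32-second first-order kernel spans 4096 samples, whereas an 8-second second-order kernel already spans 1024 x 1024 = 1,048,576 samples on the full square, or 524,800 on the triangle. This gives a rough sense of why the practical limit is so much shorter in the second-order case.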

Despite their limitations, the wavelet-based kernel identification algorithms have shown
considerable promise for the identification of first- and second-order kernels for weakly
nonlinear systems. There are many possible directions for future work. One such area
involves improving the current implementations to make them more efficient and to increase
the number of coefficients that can be included in the model. An important next step is to
identify second-order kernels from experimental data. This is essential, as the goal of
this work is to provide a nonlinear modeling tool that can be used in engineering practice.
Another possibility for future work is to extend these algorithms to include higher-order
kernels. This would clearly require an increase in the computational capabilities of the
current implementations. There is also a great deal of work that can be done in terms of
the practical application of these algorithms. As discussed in Section 7.4, first-order
kernels identified from flight data can be incorporated into the flutterometer to achieve
more accurate flutter predictions. A natural extension of this application, then, would be
the identification of second-order kernels for aeroelastic systems in order to study and
predict nonlinear phenomena such as limit cycle oscillation.